(57) Abstract: The present invention relates to the identification of genes which have been disrupted in patients diagnosed as suffering
from schizophrenia and/or bi-polar affective disorder, as well as proteins encoded by the gene and antibodies thereto and to
uses of such products as medicaments for treating schizophrenia and/or affective psychosis. The invention also relates to methods 
for diagnosing patients suffering or predisposed to schizophrenia and/or affective psychosis, as well as screens for developing novel 
treatment regimes for schizophrenia and/or affective psychosis. 



WO 03/087408 




PCT/GB03/01543 



SCHIZOPHRENIA ASSOCIATED GENES 
The present invention relates to the identification of 
genes which have been disrupted in patients diagnosed as 
suffering from schizophrenia and/or bi-polar affective 
disorder, as well as proteins encoded by the gene and 
antibodies thereto and to uses of such products as 
medicaments for treating schizophrenia and/or affective
psychosis. The invention also relates to methods for 
diagnosing patients suffering or predisposed to 
schizophrenia and/or affective psychosis, as well as 
screens for developing novel treatment regimes for 
schizophrenia and/or affective psychosis. 

Schizophrenia and Bipolar Affective Disorder are 
common and debilitating psychiatric disorders. Despite a 
wealth of information on the epidemiology, neuroanatomy and 
pharmacology of the illness, it is uncertain what molecular 
pathways are involved and how impairments in these affect 
brain development and neuronal function. Despite an 
estimated heritability of 60-80%, very little is known 
about the number or identity of genes involved in these 
psychoses. Although there has been recent progress in 
linkage and association studies, especially from genome- 
wide scans, these studies have yet to progress from the 
identification of susceptibility loci or candidate genes to 
the full characterisation of disease-causing genes
(Berrettini, 2000).

The cloning of breakpoints in patients with chromosome 
abnormalities (translocations, inversions etc.) has proved 
instrumental in the identification of many disease genes
(e.g. Duchenne Muscular Dystrophy, Retinoblastoma, Wilms'
Tumour, Familial Polyposis Coli, Fragile-X Syndrome,
Polycystic Kidney Disease, many leukaemias and, very
recently, a candidate speech and language disorder gene
(Lai et al, 2001)). Such studies assume that the
chromosomal breakpoints give rise to the clinical symptoms
by either directly disrupting gene sequences or perturbing
gene expression. In the same way that gene-trap
mutagenesis can be used to identify disrupted mouse genes
(Brennan & Skarnes, 1999), the physical "flag" created by
a cytogenetic breakpoint provides a geographical pointer
for the disease locus.

It is amongst the objects of the present invention to 
provide genes and/or proteins postulated to be involved 
with the development and/or symptoms associated with
schizophrenia and/or affective psychosis. 

As will be seen, the present invention is based on 
the molecular characterisation of a chromosomal disruption 
in subjects diagnosed as suffering from schizophrenia
and/or affective psychosis. A high-throughput Fluorescence
in situ Hybridisation (FISH)-based approach has been
adopted to map the chromosomal breakpoints in these 
patients. Consultation of the sequence data at the 
breakpoint locus not only allows efficient FISH probe 
selections to be made by the targeting of coding regions, 
but also proof of gene disruption can be made entirely by 
relating the exact position of probes to the genomic 
structure of a candidate gene. 

Four patients have been studied and their chromosomal 
disruptions characterised. Hereinafter the patients will 
be identified as patients 1-4. 

As will be seen, in one embodiment, the present 
invention is based on the molecular characterisation of a 
chromosomal rearrangement denoted t(3;8)(p13;p22) in a
subject (patient 1) diagnosed as suffering from a
schizoaffective disorder (see Fig. 1). A high-throughput
Fluorescence in situ Hybridisation (FISH)-based approach
was adopted to map the chromosomal breakpoints in these 
patients. Consultation of the sequence data at the 
breakpoint loci not only allowed efficient FISH probe 
selections to be made by the targeting of coding regions, 
but also proof of gene disruption was inferred entirely by 
relating the exact position of probes to the genomic 
structure of a candidate gene.

One breakpoint (located on chromosome 8p22) in this
subject lies near to a gene, N33, involved in the N-Linked 
Glycosylation pathway. 

This pathway consists of three stages. Firstly, a donor
oligosaccharide is assembled at the endoplasmic
reticulum lumen membrane. Secondly, this molecule is
transferred onto newly translated secretory and transmembrane
proteins, catalyzed by the oligosaccharyltransferase (OST)
complex. Thirdly, there is subsequent modification of the
oligosaccharides on the glycoprotein. N33 encodes a
protein thought to be involved in the second stage of the
pathway by analogy with yeast homologues. Without wishing
to be bound by theory, it is hypothesised that the
breakpoint in the subject perturbs N33 expression
indirectly through position effect silencing or separation
of regulatory elements from the gene promoter (both effects
have been shown to occur even when the breakpoints are up
to 1Mb from the target gene in some instances (Kleinjan et
al, 1999)).

As the N33 gene is located within a chromosomal 
region repeatedly found positive in schizophrenia linkage 
studies, the present inventors pursued this gene further by
association study.

Certain microsatellite repeat haplotypes have been
identified at the N33 locus which are over-represented in
schizophrenic patients and their families compared to the
normal population. Subsequent sequencing of the N33 gene
in haplotype-carrying individuals is ongoing in order to
identify causative mutations. 

The other breakpoint in this patient (3p13) has now
also been fully characterised and demonstrated to disrupt
a gene, SEMCAP3 (also known as KIAA1095). The present
invention is therefore also based on a proposed role of
this gene (normal and mutated forms) in the aetiology of
schizophrenia and/or affective psychosis.

In a further embodiment the present invention is
based on the GRIK4 gene and observations of the present
inventors of an involvement of this gene and/or protein
with schizophrenia and/or affective psychosis.

The GRIK4 gene is also known as KA1 and EAA1, and will
herein be referred to as GRIK4 for simplicity; this should
not be construed as limiting.

The subject (patient 2) was one of a series of around 
100 patients with comorbid schizophrenia and mild learning 
disability (US terminology: "mental retardation") who were 
screened using routine G-band karyotyping. This patient 
possesses a complex chromosomal rearrangement which can be
described by standard nomenclature as: (46,XX,ins(8;11)
(q13;q23.3q24.2)inv(2)(p12q32.1)t(2;11)(q21.3;q24.2)
der(2)(2qter->2q32.1::2p12->2q21.3::11q24.2->11qter)
der(11)(11pter->11q23.3::2q21.3->2q32.1::2p12->2pter)
der(8)(8pter->8q13::11q23.2->11q24.2::8q13->8qter)).
It has been repeatedly observed that schizophrenia occurs 
more frequently in individuals with mild learning 
disability than in the general population and recent work 
has revealed an increased heritability of this comorbid 
state. 

As described herein, the FISH results reveal that the
subject has a disruption in a brain-expressed gene, namely
GRIK4, which is known to participate in molecular mechanisms
responsible for modulating the strength of synaptic 
transmission. 

In a further embodiment the present invention is based 
on the characterisation of a balanced reciprocal 
translocation between chromosomes 9 and 14, 
t(9;14)(q34;q13), in a mother (patient 3) with schizophrenia
and her daughter with schizophrenia co-morbid with mild
learning disability. A brain transcription factor gene,
NPAS3, is shown to be disrupted by the translocation at
14q13. Without wishing to be bound by theory, the present
inventors' hypothesis is that the disruption of this gene is
responsible for the psychotic symptoms exhibited by the
mother and daughter.

As will be seen, the present invention is also based 
on the molecular characterisation of a chromosomal 
rearrangement denoted t(1;16)(p31.2;q21) (in patient 4).

The proband met ICD-10 and DSM-IV criteria for 
definite schizophrenia. The translocation was inherited 
within other branches of the family with variable clinical 
expression. However some key translocation carriers of the 
subjects to whom the inventors had access had not passed 
the age of risk when clinically characterized. 

One breakpoint (located on chromosome 1p31.2) in
patient 4 lies within an alternatively spliced form of the
gene, PDE4B, involved in the attenuation of cAMP secondary
messenger signaling.

The remaining breakpoint in this patient (16q21) has
now also been fully characterised and demonstrated to disrupt a
gene, CADHERIN 8 (CDH8). The present invention is therefore
based in part on a proposed role of this gene in the 
aetiology of schizophrenia and/or affective psychosis. 

In a first aspect the present invention provides use 
of a polynucleotide fragment or fragments comprising 
SEMCAP3, N33, NPAS3, GRIK4, PDE4B and/or CDH8 gene(s) or a
fragment(s), derivative(s) or homologue(s) thereof for the
manufacture of a medicament for treating schizophrenia 
and/or affective psychosis in a subject. 

In another aspect the present invention provides use 
of a polypeptide fragment or fragments encoded by SEMCAP3,
N33, NPAS3, GRIK4, PDE4B and/or CDH8 gene(s), or a
fragment(s), derivative(s) or homologue(s) thereof for the
manufacture of a medicament for treating schizophrenia 
and/or affective psychosis in a subject. 

Schizophrenia and/or affective psychosis as used 
herein relates to schizophrenia, as well as other affective 
psychoses such as those listed in "The ICD-10 
Classification of Mental and Behavioural Disorders" World 
Health Organization, Geneva 1992. Categories F20 to F29 
inclusive includes Schizophrenia, schizotypal and
delusional disorders. Categories F30 to F39 inclusive are 
Mood (affective) disorders that include bipolar affective 
disorder and depressive disorder. Mental Retardation is 
coded F70 to F79 inclusive. The corresponding categories in The
Diagnostic and Statistical Manual of Mental Disorders, Fourth
Edition (DSM-IV), American Psychiatric Association, Washington
DC, 1994, include all conditions coded 295.xx (Schizophrenia and
Other Psychotic Disorders) and 296.xx (Major Depressive
Disorders and Bipolar Disorders). Mental Retardation is
coded 315, 317, 318 and 319.

SEMCAP3 has been previously cloned and sequenced in 
mouse as two alternative forms (Semcap3A and 3B) and the 
sequences are present in the public database (nucleic acid 
sequences; AF127084/AF127085, respectively; protein
sequences AAF22131/AAF22132, respectively) as directly
submitted by Wang & Strittmatter, 1999. The human form of
the gene is defined by sequence KIAA1095 (nucleic acid
sequence, AB029018 or XM_041363, and a smaller form,
BC014432; protein sequence, XP_041363). The genomic
sequences corresponding to this gene are also present in
the public database (eg. for BAC RP11-252O10, AC024102).
Nevertheless, the prior art does not suggest any link 
between SEMCAP3 and schizophrenia and/or affective 
psychosis. 

Thus, references herein to the SEMCAP3 gene are 
understood to relate to the sequences in the public 
databases and identified in Fig. 3 and references to the 
SEMCAP3 protein sequence are understood to relate to the
sequences in the public databases and identified in Fig. 4. 

N33 has been previously cloned and sequenced and the 
sequence is present in the public database (Nucleic acid 
sequence; U42349, Protein sequence; Q13454) and described 
in MacGrogan et al, 1996. The genomic sequences
corresponding to this gene are also present in the public
database (eg. for BAC RP11-23J14) but some SNP
polymorphisms or sequencing errors (eg. an extra "C"
present in exon 1b, see hereinafter - cctgcccCaccggg -) may
result in differences to the sequences presented herein.
Nevertheless, the prior art does not suggest any link 
between N33 and schizophrenia and affective psychosis. 

In addition to the sequences previously identified, 
the present inventors have identified a new start exon (1a,
see Figures 6 and 7) and have observed the complexity of
the exon splicing at the 3' end of the gene (see Figures 6
and 7).

Thus, references herein to the N33 gene are understood 
to relate to the sequences in the public databases and 
identified in Figures 6 and 7 and references to the N33 
protein sequence are understood to relate to the sequences 
in the public databases and identified in Figures 6 and 7. 

The GRIK4 gene is located on chromosome 11, at 
cytogenetic position 11q22.3. The gene encodes a kainate
receptor subunit and has been previously described by
Kamboj et al, 1994. The cDNA nucleotide sequence and
peptide sequence were disclosed by Kamboj et al, 1994 and
submitted to the Genbank/EMBL database under accession
NM_014619. The coding sequence of the gene is identified
as being 2871 nucleotides in length, coding for a protein of
957 amino acids. The nucleotide and protein sequences are
shown in Figures 10 and 11 respectively. The present 
inventors have identified an alternative start site for the 
gene (see Figures 15-17) which would result in a shorter 
gene/protein of 933 amino acids as opposed to 956. The 
full nucleotide sequence and protein sequence of this 
alternatively encoded gene/protein is shown in Figures 16 
and 17. 
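
By way of illustration only, the relationship between a coding
sequence length and the corresponding protein length can be checked
with a short calculation. The sketch below assumes that the 2871
nucleotide figure includes the terminating stop codon, which the
text does not state; the function name is hypothetical.

```python
def codon_and_residue_counts(cds_length_nt, includes_stop=True):
    """Return (codons, residues) for a coding sequence of the given length.

    includes_stop is an assumption made for illustration: when True, the
    final codon is treated as a stop codon and contributes no residue.
    """
    if cds_length_nt % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    codons = cds_length_nt // 3
    residues = codons - 1 if includes_stop else codons
    return codons, residues


codons, residues = codon_and_residue_counts(2871)
print(codons, residues)
```

On that assumption, 2871 nucleotides correspond to 957 codons and
hence to 956 amino acid residues.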

Thus, references herein to the GRIK4 gene are 
understood to relate to the sequences identified in Figures
10 and 16 and references to the GRIK4 protein sequence are
understood to relate to the sequences identified in Figures
11 and 17.

The human form of NPAS3 has previously been identified 
and is found in the public database under accession numbers 
AB054575 and AF164438, with the differences due to
alternative splicing; all forms are encompassed within
the present invention.

Thus, references herein to the NPAS3 gene are 
understood to relate to the sequences identified in Figures
18 and 20 and references to the NPAS3 protein sequence are
understood to relate to the sequences identified in Figures
19 and 21.

The PDE4B gene is located on chromosome 1 at 
cytogenetic position 1p31.2. The gene encodes a
phosphodiesterase which shows homology to the Dunce learning
and memory gene product of Drosophila melanogaster (Bolger
et al, 1993). Two long (PDE4B1 and PDE4B3) and one short
(PDE4B2) splice forms are described herein. There is a core
protein sequence of 525 amino acid residues shared by all
three forms. On to this are added 39 N-terminal amino acid
residues in the case of PDE4B2. Both of the long forms
share an additional central stretch of 118 amino acid
residues, but then diverge at the N-terminal end of the
proteins; PDE4B1 has 93 specific residues and PDE4B3, 78.
It is predicted that only the PDE4B1 splice form (brain 
expressed) may be disrupted by the chromosomal abnormality 
observed in the patient and family. 
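
By way of illustration only, the total length of each splice form
follows by simple addition from the residue counts given above. The
sketch below uses only those counts; the variable and function names
are hypothetical.

```python
CORE = 525           # residues shared by all three splice forms
LONG_CENTRAL = 118   # additional stretch shared by the two long forms
N_TERMINAL = {"PDE4B1": 93, "PDE4B2": 39, "PDE4B3": 78}
LONG_FORMS = {"PDE4B1", "PDE4B3"}


def isoform_length(name):
    """Total residues: form-specific N-terminus, optional central stretch, core."""
    central = LONG_CENTRAL if name in LONG_FORMS else 0
    return N_TERMINAL[name] + central + CORE


for name in sorted(N_TERMINAL):
    print(name, isoform_length(name))
```

This gives 736 residues for PDE4B1, 564 for PDE4B2 and 721 for
PDE4B3, as implied by the figures quoted in the text.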

Thus, references herein to the PDE4B gene are 
understood to relate to the sequences identified in Figures 
25, 27 and 29 and references to the PDE4B protein sequence 
are understood to relate to the sequences identified in 
Figures 26, 28 and 30. 

CADHERIN 8 (CDH8) has been previously cloned and
sequenced and the sequence is present in the public
database (nucleic acid sequence; L34060/AB035305/NM_001796,
protein sequence; NP_001787) and described in Suzuki et
al., 1991, Tanihara et al., 1994, and Shimoyama et al.,
2000. An alternative transcript form has been described in
the rat in which there is a truncation within the 5th
cadherin domain (Kido et al., 1998 and see Fig. 4). The
accession numbers for the normal and truncated forms of
CDH8 in rat are AB010436 and AB010437, respectively. The
corresponding human truncated transcript is not present in
the public database and so is not yet confirmed. The
genomic sequences corresponding to CDH8 are also present in
the public database (eg. BAC CTC-420A11; AC040161).
Nevertheless, the prior art does not suggest any link 
between CDH8 and schizophrenia and/or affective psychosis. 

Thus, references herein to the CDH8 gene are
understood to relate to the nucleic sequences in the public 
databases and identified in Fig. 35 and references to the 
CDH8 protein sequences are understood to relate to the 
sequences in the public databases and identified in Fig. 36. 

In certain jurisdictions claims to methods of 
treatment are permissible and so the skilled reader will 
appreciate that the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s), or fragment(s), derivative(s) or
homologue(s) thereof; or SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 protein, or functionally active fragment(s),
derivative(s), or homologue(s) thereof, may be administered
to an individual as a method of treating an individual with 
schizophrenia and/or affective psychosis. 

"Polynucleotide fragment" as used herein refers to a 
chain of nucleotides such as deoxyribose nucleic acid (DNA) 
and transcription products thereof, such as RNA. 
Naturally, the skilled addressee will appreciate the whole 
naturally occurring human genome is not included in the 
definition of polynucleotide fragment. 

The polynucleotide fragment can be isolated in the 
sense that it is substantially free of biological material 
with which the whole genome is normally associated in vivo. 
The isolated polynucleotide fragment may be cloned to 
provide a recombinant molecule comprising the 
polynucleotide fragment. Thus, "polynucleotide fragment"
includes double and single stranded DNA, RNA and
polynucleotide sequences derived therefrom, for example,
subsequences of said fragment and which are of any
desirable length. Where a nucleic acid is single stranded
then both a given strand and a sequence complementary or
reverse complementary thereto are within the scope of the
present invention.

In general, the term "expression product" or "gene 
product" refers to both transcription and translation 
products of said polynucleotide fragments. When the 
expression or gene product is a "polypeptide" (i.e. a chain 
or sequence of amino acids displaying a biological activity 
substantially similar (eg. 98%, 95%, 90%, 80%, 75% 
activity) to the biological activity of the protein), it
does not refer to a specific length of the product as such.
Thus, the skilled addressee will appreciate that
"polypeptide" encompasses inter alia peptides, polypeptides
and proteins. The polypeptide, if required, can be modified
in vivo and in vitro, for example by glycosylation,
amidation, carboxylation, phosphorylation and/or
post-translational cleavage.

The present invention further provides a recombinant 
or synthetic polypeptide for the manufacture of reagents 
for use as therapeutic agents in the treatment of 
schizophrenia and/or affective psychosis. In particular, 
the invention provides pharmaceutical compositions 
comprising the recombinant or synthetic polypeptide 
together with a pharmaceutically acceptable carrier
therefor. 

The present invention further provides an isolated 
polynucleotide fragment capable of specifically hybridising 
to a related polynucleotide sequence from another species. 
In this manner, the present invention provides probes 
and/or primers for use in ex vivo and/or in situ detection 
and expression studies. Typical detection studies include 
polymerase chain reaction (PCR) studies, hybridisation 
studies, or sequencing studies. In principle any specific 
polynucleotide sequence fragment from the identified 
sequences may be used in detection and/or expression 
studies. The skilled addressee understands that a specific 
fragment is a fragment of the sequence which is of 
sufficient length, generally greater than 10, 12, 14, 16 or
20 nucleotides in length, to bind specifically to the 
sequence, under conditions of high stringency, as defined 
herein, and not bind to unrelated sequences, that is
sequences from elsewhere in the genome of the organism 
other than an allelic form of the sequence or non- 
homologous sequences from other organisms. 

"Capable of specifically hybridising" is taken to mean 
that said polynucleotide fragment preferably hybridises to 
a related or similar polynucleotide sequence in preference 
to unrelated or dissimilar polynucleotide sequences.

The invention includes polynucleotide sequence(s)
which are capable of specifically hybridising to a
polynucleotide fragment as described herein or to a part
thereof without necessarily being completely complementary
or reverse complementary to said related polynucleotide
sequence or fragment thereof. For example, there may be at
least 50%, or at least 75%, at least 90%, or at least 95%
complementarity. Of course, in some cases the sequences
may be exactly reverse complementary (100% reverse
complementary) or nearly so (e.g. there may be less than
10, typically less than 5 mismatches). Thus, the present
invention also provides anti-sense or complementary
nucleotide sequence(s) which is/are capable of specifically
hybridising to the disclosed polynucleotide sequence. If
a specific polynucleotide is to be used as a primer in PCR
and/or sequencing studies, the polynucleotide must be
capable of hybridising to related nucleic acid and capable
of initiating chain extension from the 3' end of the
polynucleotide, but not able to correctly initiate chain
extension from unrelated sequences.
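
By way of illustration only, the degree of reverse complementarity
between a probe and a target site of equal length can be expressed
as a percentage of matching positions. The sketch below is a
simplified, ungapped comparison; the function names and the example
sequences are hypothetical and are not taken from the Figures.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def percent_reverse_complementarity(probe, target):
    """Percentage of probe positions that pair with an equal-length target site."""
    if len(probe) != len(target) or not probe:
        raise ValueError("probe and target must be non-empty and of equal length")
    expected = reverse_complement(target)
    matches = sum(p == e for p, e in zip(probe.upper(), expected))
    return 100.0 * matches / len(probe)


# Hypothetical four-base example: one mismatch in four positions.
print(percent_reverse_complementarity("GCAT", "ATGC"))
print(percent_reverse_complementarity("GCAA", "ATGC"))
```

The first probe is exactly reverse complementary to the target
(100%), whereas the second carries a single mismatch (75%).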

If a polynucleotide sequence of the present invention 
is to be used in hybridisation studies to obtain or 
identify a related sequence from another organism, the
polynucleotide sequence should preferably remain hybridised
to a sample polynucleotide under stringent conditions. If
desired, either the test or sample polynucleotide may be
immobilised. Generally the test polynucleotide sequence is
at least 10, 14, 20 or at least 50 bases in length. It may
be labelled by suitable techniques known in the art.
Preferably the test polynucleotide sequence is at least 200
bases in length and may even be several kilobases in
length. Thus, either a denatured sample or test sequence
can be first bound to a support. Hybridization can be
effected at a temperature of between 50 and 70°C in double
strength SSC (2 x SSC; NaCl at 17.5g/l and sodium citrate
(SC) at 8.8g/l) buffered saline containing 0.1% sodium dodecyl
sulphate (SDS). This can be followed by rinsing of the
support at the same temperature but with a buffer having a 
reduced SSC concentration. Depending upon the degree of 
stringency required, and thus the degree of similarity of 
the sequences, such reduced concentration buffers are 
typically single strength SSC containing 0.1% SDS, half
strength SSC containing 0.1% SDS and one tenth strength SSC
containing 0.1% SDS. Sequences having the greatest degree
of similarity are those the hybridisation of which is least
affected by washing in buffers of reduced concentration.
It is most preferred that the sample and inventive
sequences are so similar that the hybridisation between
them is substantially unaffected by washing or incubation
in standard sodium citrate (0.1 x SSC) buffer containing
0.1% SDS.
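
By way of illustration only, the salt content of the reduced
concentration wash buffers can be derived by scaling the double
strength SSC recipe given above. The sketch below assumes simple
linear dilution of both salts; the names used are hypothetical.

```python
# Double strength (2 x) SSC as given in the text, in grams per litre.
DOUBLE_STRENGTH_SSC = {"NaCl": 17.5, "sodium citrate": 8.8}


def ssc_composition(strength):
    """Scale the 2 x SSC recipe linearly to another strength (e.g. 1, 0.5, 0.1)."""
    if strength <= 0:
        raise ValueError("strength must be positive")
    factor = strength / 2.0
    return {salt: grams * factor for salt, grams in DOUBLE_STRENGTH_SSC.items()}


# Wash series of decreasing salt concentration, i.e. increasing stringency.
for strength in (2, 1, 0.5, 0.1):
    composition = ssc_composition(strength)
    print(strength, {salt: round(g, 3) for salt, g in composition.items()})
```

Lower SSC strengths correspond to more stringent washes, so that
only closely similar sequences remain hybridised.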

Oligonucleotides may be designed to specifically 
hybridise to N33, SEMCAP3, NPAS3, GRIK4, PDE4B and/or CDH8
nucleic acid. They may be synthesised by known techniques
and used as primers in PCR or sequencing reactions or as
probes in hybridisations designed to detect the presence of
a mutated or normal N33, SEMCAP3, NPAS3, GRIK4, PDE4B
and/or CDH8 gene(s) in a sample. The oligonucleotides may
be labelled by suitable labels known in the art, such as
radioactive labels, chemiluminescent labels or fluorescent
labels and the like. 

The term "oligonucleotide" is not meant to indicate 
any particular length of sequence and encompasses 
nucleotides of preferably at least 10b (e.g. 10b to 1kb) in
length, more preferably 12b-500b in length and most
preferably 15b to 100b.

The oligonucleotides may be designed with respect to 
any of the sequences described herein and may be 
manufactured according to known techniques. They may have 
substantial sequence identity (e.g. at least 50%, at least 
75%, at least 90% or at least 95% sequence identity) with 
one of the strands shown therein or an RNA equivalent, or 
with a part of such a strand. Preferably such a part is at 
least 10, at least 30, at least 50 or at least 200 bases
long. It may be an open reading frame (ORF) or a part
thereof.

Oligonucleotides which are generally greater than 30 
bases in length should preferably remain hybridised to a 
sample polynucleotide under one or more of the stringent 
conditions mentioned above. Oligonucleotides which are 
generally less than 30 bases in length should also 
preferably remain hybridised to a sample polynucleotide but 
under different conditions of high stringency. Typically 
the melting temperature of an oligonucleotide less than 30 
bases may be calculated according to the formula of: 2°C
for every A or T, plus 4°C for every G or C, minus 5°C.
Hybridization may take place at or around the calculated
melting temperature for any particular oligonucleotide, in
6 x SSC and 1% SDS. Non-specifically hybridised
oligonucleotides may then be removed by stringent washing, 
for example in 3 x SSC and 0.1% SDS at the same 
temperature. Only substantially similar matched sequences 
remain hybridised i.e. said oligonucleotide and 
corresponding test nucleic acid. 
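
By way of illustration only, the melting temperature formula given
above for oligonucleotides of less than 30 bases can be written as a
short function. The function name and the example oligonucleotide
are hypothetical.

```python
def melting_temperature(oligo):
    """Melting temperature in degrees C: 2 per A or T, 4 per G or C, minus 5."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("oligonucleotide must contain only A, C, G and T")
    return 2 * at + 4 * gc - 5


# Hypothetical 16-base oligonucleotide with 8 A/T and 8 G/C bases.
print(melting_temperature("GGGGCCCCAAAATTTT"))
```

For this example the calculation is (2 x 8) + (4 x 8) - 5 = 43°C.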

When oligonucleotides of generally less than 30 bases 
in length are used in sequencing and/or PCR studies, the 
melting temperature may be calculated in the same manner as 
described above. The oligonucleotide may then be allowed 
to anneal or hybridise at a temperature around the 
oligonucleotide's calculated melting temperature. In the
case of PCR studies the annealing temperature should be
around the lower of the calculated melting temperatures for
the two priming oligonucleotides. It is to be appreciated 
that the conditions and melting temperature calculations 
are provided by way of example only and are not intended to 
be limiting. It is possible through the experience of the 
experimenter to vary the conditions of hybridisation and 
thus anneal/hybridise oligonucleotides at temperatures 
above their calculated melting temperature. Indeed this 
can be desirable in preventing so-called non-specific 
hybridisation from occurring. 
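
By way of illustration only, the choice of a PCR annealing
temperature from the two primer melting temperatures can be
sketched as follows. The optional offset parameter, which raises the
annealing temperature above the calculated value, and all names
used are hypothetical.

```python
def melting_temperature(oligo):
    """Melting temperature in degrees C: 2 per A or T, 4 per G or C, minus 5."""
    seq = oligo.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    if not seq or at + gc != len(seq):
        raise ValueError("oligonucleotide must contain only A, C, G and T")
    return 2 * at + 4 * gc - 5


def annealing_temperature(forward_primer, reverse_primer, offset=0):
    """Lower of the two primer melting temperatures, plus an optional offset."""
    lower = min(melting_temperature(forward_primer),
                melting_temperature(reverse_primer))
    return lower + offset


# Hypothetical primer pair; the second call anneals 2 degrees C higher.
print(annealing_temperature("ATGCATGCATGCATGCAT", "GGCCGGCCATATATGGCC"))
print(annealing_temperature("ATGCATGCATGCATGCAT", "GGCCGGCCATATATGGCC", offset=2))
```

A positive offset corresponds to annealing above the calculated
melting temperature, which the experimenter may choose in order to
reduce non-specific hybridisation.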

It is possible when conducting PCR studies to predict 
an expected size or sizes of PCR product(s) obtainable
using an appropriate combination of two or more 
oligonucleotides, based on where they would hybridise to 
the sequences described herein. If, on conducting such a 
PCR on a sample of DNA, a fragment of the predicted size is 
obtained, then this is predictive that the DNA encodes a 
homologous sequence from a test organism. 
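
By way of illustration only, the expected size of a PCR product can
be predicted from the positions at which two primers would
hybridise to a known sequence. The sketch below assumes exact
matching of both primers to a single-stranded template written 5'
to 3'; the template, the primers and the function names are
hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def predicted_product_size(template, forward_primer, reverse_primer):
    """Size in bases of the product spanned by the two primers, or None."""
    template = template.upper()
    start = template.find(forward_primer.upper())
    if start < 0:
        return None
    # The reverse primer binds the opposite strand, so its reverse
    # complement is searched for downstream of the forward primer site.
    site = template.find(reverse_complement(reverse_primer), start)
    if site < 0:
        return None
    return site + len(reverse_primer) - start


# Hypothetical template: 2 flanking bases, forward site, 10-base spacer,
# reverse primer site, 2 flanking bases.
template = "TT" + "ATGGCC" + "A" * 10 + "CGATAG" + "GG"
print(predicted_product_size(template, "ATGGCC", "CTATCG"))
```

A fragment of the predicted size obtained from a test sample would
then be consistent with, though not proof of, the presence of the
expected sequence.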

Proteins for all the applications described herein can 
be produced by cloning the gene, for example, into plasmid
vectors that allow high expression in a system of choice,
e.g. insect cell culture, yeast, animal cells, or bacteria
such as Escherichia coli. To enable effective purification
of the protein, a vector may be used that incorporates an 
epitope tag (or other "sticky" extension such as His6) onto 
the protein on synthesis. A number of such vectors and 
purification systems are commercially available. 

The polynucleotide fragment can be molecularly cloned 
into a prokaryotic or eukaryotic expression vector using 
standard techniques and administered to a host. The 
expression vector is taken up by cells and the 
polynucleotide fragment of interest expressed, producing 
protein.

It will be understood that for the particular 
polypeptides embraced herein, natural variations such as 
may occur due to polymorphisms, can exist between 
individuals or between members of the family. These
variations may be demonstrated by (an) amino acid
difference(s) in the overall sequence or by deletions,
substitutions, insertions, inversions or additions of (an)
amino acid(s) in said sequence. All such derivatives
showing the recognised activity are included within the 
scope of the invention. For example, for the purpose of 
the present invention conservative replacements may be made 
between amino acids within the following groups: 

(I) Alanine, serine, threonine; 

(II) Glutamic acid and aspartic acid; 

(III) Arginine and lysine;

(IV) Asparagine and glutamine; 

(V) Isoleucine, leucine and valine; 

(VI) Phenylalanine, tyrosine and tryptophan.

Moreover, recombinant DNA technology may be used to
prepare nucleic acid sequences encoding the various
derivatives outlined above.
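
By way of illustration only, the grouping of amino acids for
conservative replacement can be represented as a lookup. In the
sketch below the groups are written in one-letter code, and group
(III) is encoded as the basic pair arginine and lysine, which is an
assumption made for the purposes of the example; the names used are
hypothetical.

```python
CONSERVATIVE_GROUPS = [
    {"A", "S", "T"},  # (I) alanine, serine, threonine
    {"E", "D"},       # (II) glutamic acid, aspartic acid
    {"R", "K"},       # (III) arginine, lysine (assumed basic pair)
    {"N", "Q"},       # (IV) asparagine, glutamine
    {"I", "L", "V"},  # (V) isoleucine, leucine, valine
    {"F", "Y", "W"},  # (VI) phenylalanine, tyrosine, tryptophan
]


def is_conservative(original, replacement):
    """True if two different residues fall within the same group."""
    a, b = original.upper(), replacement.upper()
    if a == b:
        return False
    return any(a in group and b in group for group in CONSERVATIVE_GROUPS)


print(is_conservative("I", "V"))  # same group (V)
print(is_conservative("A", "W"))  # different groups
```

A replacement within a group would be regarded as conservative for
the purpose of the present example, whereas a replacement between
groups would not.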

As is well known in the art, the degeneracy of the 
genetic code permits substitution of bases in a codon 
resulting in a different codon which is still capable of 
coding for the same amino acid, e.g. the codon for amino
acid glutamic acid is both GAG and GAA. Consequently, it
is clear that for the expression of polypeptides from 
nucleotide sequences described herein or fragments thereof, 
use can be made of a derivative nucleic acid sequence with 
such an alternative codon composition different from the 
nucleic acid sequences shown in the Figures. 
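
By way of illustration only, the effect of codon degeneracy can be
shown with a small excerpt of the standard genetic code, in which
glutamic acid is encoded by GAA and GAG whereas GAT and GAC encode
aspartic acid. The table below is deliberately partial and the
names and example sequences are hypothetical.

```python
# Partial standard genetic code (DNA codons); only the entries needed here.
CODON_TABLE = {
    "GAA": "Glu", "GAG": "Glu",
    "GAT": "Asp", "GAC": "Asp",
    "AAA": "Lys", "AAG": "Lys",
}


def translate(coding_sequence):
    """Translate a DNA coding sequence codon by codon using CODON_TABLE."""
    seq = coding_sequence.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of 3")
    return [CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3)]


def synonymous(codon_a, codon_b):
    """True if two codons encode the same amino acid."""
    return CODON_TABLE[codon_a.upper()] == CODON_TABLE[codon_b.upper()]


# Two different nucleotide sequences encoding the same dipeptide.
print(translate("GAAAAA"), translate("GAGAAG"))
print(synonymous("GAA", "GAG"), synonymous("GAA", "GAT"))
```

The two sequences differ at the third position of each codon yet
encode the same amino acids, which is the property relied upon when
a derivative nucleic acid sequence with an alternative codon
composition is used.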

The polynucleotide fragments of the present invention 
are preferably linked to regulatory control sequences. 
Such control sequences may comprise promoters, operators, 
inducers, enhancers, silencers, ribosome binding sites, 
terminators etc. Suitable control sequences for a given 
host may be selected by those of ordinary skill in the art. 

A polynucleotide fragment according to the present 
invention can be ligated to various expression controlling 
sequences, resulting in a so called recombinant nucleic 
acid molecule. Thus, the present invention also includes
an expression vector containing an expressible nucleic acid
molecule. The recombinant nucleic acid molecule can then 
be used for the transformation of a suitable host. 

Specific vectors which can be used to clone nucleic 
acid sequences according to the invention are known in the 
art (e.g. Rodriguez, R.L. and Denhardt, D.T., Edit.,
Vectors: a survey of molecular cloning vectors and their
uses, Butterworths, 1988, or Jones et al., Vectors: Cloning
Applications: Essential Techniques (Essential techniques
series), John Wiley & Sons, 1998).

The methods to be used for the construction of a 
recombinant nucleic acid molecule according to the 
invention are known to those of ordinary skill in the art 
and are inter alia set forth in Sambrook et al. (Molecular
Cloning: a laboratory manual, Cold Spring Harbor
Laboratory, 1989).

The present invention also relates to a transformed 
cell containing the polynucleotide fragment in an 
expressible form. "Transformation", as used herein, refers
to the introduction of a heterologous polynucleotide
fragment into a host cell. The method used may be any
known in the art, for example, direct uptake, transfection,
transduction or electroporation (Current Protocols in
Molecular Biology, 1995. John Wiley and Sons Inc.). The 
heterologous polynucleotide fragment may be maintained 
through autonomous replication or alternatively, may be 
integrated into the host genome. The recombinant nucleic 
acid molecules preferably are provided with appropriate 
control sequences compatible with the designated host which 
can regulate the expression of the inserted polynucleotide 
fragment, e.g. tetracycline responsive promoter, thymidine 
kinase promoter, SV-40 promoter and the like. 

Suitable hosts for the expression of recombinant 
nucleic acid molecules may be prokaryotic or eukaryotic in 
origin. Hosts suitable for the expression of recombinant 
nucleic acid molecules may be selected from bacteria, 
yeast, insect cells and mammalian cells.

In another aspect the present invention also relates
to a method of diagnosing schizophrenia and/or affective
psychosis or susceptibility to schizophrenia and/or
affective psychosis in an individual, wherein the method
comprises determining if SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) in the individual has been disrupted by
a mutation or chromosomal rearrangement.

The methods which may be employed to elucidate such a
mutation or chromosomal rearrangement are well known to
those of skill in the art; the mutation or rearrangement
could be detected, for example, using PCR or in
hybridisation studies using suitable probes which could be
designed to span an identified mutation site or chromosomal
breakpoint in close proximity to the/said N33, SEMCAP3,
NPAS3, GRIK4, PDE4B and/or CDH8 gene(s), such as the
breakpoint identified by the present inventors and
described herein.

Once a particular polymorphism or mutation has been 
identified it may be possible to determine a particular 
course of treatment. For example it is known that some 
forms of treatment work for some patients, but not all. 
This may in fact be due to mutations in the/said N33,
SEMCAP3, NPAS3, GRIK4, PDE4B and/or CDH8 gene(s) or
surrounding sequence, and it may therefore be possible to 
determine a treatment strategy using current therapies, 
based on a patient's genotype. 

It will be appreciated that mutations in the gene
sequence or controlling elements of a gene, e.g. a promoter
and/or enhancer, can have subtle effects such as affecting
mRNA splicing/stability/activity and/or control of gene
expression levels, which can also be determined. Also the
relative levels of RNA can be determined using for example
hybridisation or quantitative PCR as a means to determine
if the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8
gene(s) has been disrupted.

Moreover the presence and/or levels of the/said
SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 gene(s)
products themselves can be assayed by immunological
techniques such as radioimmunoassay, Western blotting and
ELISA using specific antibodies raised against the gene
products. The present invention also therefore relates to
antibodies specific for a SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) product(s) and uses thereof in
diagnosis and/or therapy.

A further aspect of the present invention therefore 
provides antibodies specific to the polypeptides of the 
present invention or epitopes thereof. Production and 
purification of antibodies specific to an antigen is a 
matter of ordinary skill, and the methods to be used are 
clear to those skilled in the art. The term antibodies can 
include, but is not limited to polyclonal antibodies, 
monoclonal antibodies (mAbs), humanised or chimeric
antibodies, single chain antibodies, Fab fragments, F(ab')2
fragments, fragments produced by a Fab expression library,
anti-idiotypic (anti-Id) antibodies, and epitope binding
fragments of any of the above. Such antibodies may be used
in modulating the expression or activity of the particular
polypeptide, or in detecting said polypeptide in vivo or in
vitro.

Using the sequences disclosed herein, it is possible 
to identify related sequences in other animals, such as 
mammals, with the intention of providing an animal model 
for psychiatric disorders associated with the improper 
functioning of the nucleotide sequences and proteins of the 
present invention. Once identified, the homologous 
sequences can be manipulated in several ways common to the 
skilled person in order to alter the functionality of the 
nucleotide sequences and proteins homologous to those of 
the present invention. For example, "knock-out" animals 
may be created, that is, the expression of the genes 
comprising the nucleotide sequences homologous to those of 
the present invention may be reduced or substantially 
eliminated in order to determine the effects of reducing or 
substantially eliminating the expression of such genes. 
Alternatively, animals may be created where the expression
of the nucleotide sequences and proteins homologous to
those of the present invention is upregulated, that is,
the expression of the genes comprising the nucleotide 
sequences homologous to those of the present invention may 
be increased in order to determine the effects of 
increasing the expression of these genes. In addition to 
these manipulations, substitutions, deletions and additions 
may be made to the nucleotide sequences encoding the 
proteins homologous to those of the present invention in 
order to effect changes in the activity of the proteins to 
help elucidate the function of domains, amino acids, etc. 
in the proteins. Furthermore, the sequences of the present
invention may also be used to transform animals in the
manner described above. The manipulations described above
may also be used to create an animal model of schizophrenia 
and/or affective psychosis associated with the improper 
functioning of the nucleotide sequences and/or proteins of 
the present invention in order to evaluate potential agents 
which may be effective for combatting psychotic disorders, 
such as schizophrenia and/or affective psychosis. 

Thus, the present invention also provides for screens 
for identifying agents suitable for preventing and/or 
treating schizophrenia and/or affective psychosis 
associated with disruption or alteration in the expression 
of the SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 gene
and/or its gene products. Such screens may easily be 
adapted to be used for the high throughput screening of 
libraries of compounds such as synthetic, natural or 
combinatorial compound libraries. 

Thus, the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) products according to the present
invention can be used for the in vivo or in vitro
identification of novel ligands or analogs thereof. For
this purpose binding studies can be performed with cells
transformed with nucleotide fragments according to the
invention or an expression vector comprising a
polynucleotide fragment according to the invention, said
cells expressing the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) products according to the invention.

Alternatively also the/said SEMCAP3, N33, GRIK4,
NPAS3, PDE4B and/or CDH8 gene(s) products according to the
invention as well as ligand-binding domains thereof can be
used in an assay for the identification of functional
ligands or analogs for the/said SEMCAP3, N33, GRIK4, NPAS3,
PDE4B and/or CDH8 gene(s) products.

Methods to determine binding to expressed gene 
products as well as in vitro and in vivo assays to 
determine biological activity of gene products are well 
known. In general, expressed gene product is contacted 
with the compound to be tested and binding, stimulation or 
inhibition of a functional response is measured. 

Thus, the present invention provides for a method for 
identifying ligands for SEMCAP3, N33, GRIK4, NPAS3, PDE4B
and/or CDH8 gene(s) products, said method comprising the
steps of: 

a) introducing into a suitable host cell a 
polynucleotide fragment according to the invention; 

b) culturing cells under conditions to allow 
expression of the polynucleotide fragment; 

c) optionally isolating the expression product; 

d) bringing the expression product (or the host cell
from step b)) into contact with potential ligands which
will possibly bind to the protein encoded by said
polynucleotide fragment from step a);

e) establishing whether a ligand has bound to the 
expressed protein; and 

f) optionally isolating and identifying the ligand.

As a preferred way of detecting the binding of the
ligand to the expressed protein, also signal transduction
capacity may be measured.

Compounds which activate or inhibit the function of 
SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 gene(s)
products may be employed in therapeutic treatments to
activate or inhibit the polypeptides of the present
invention.

The present invention will now be further described by 
way of Example and with reference to the Figures which 
show: 

Figure 1 shows an ideogram diagram of the chromosomal 
rearrangement (a reciprocal translocation) in patient 1. 
The two breakpoints are marked at the approximate 
chromosomal locations at which they are located. In 
addition, and not to scale, the two candidate
disease-causing genes, N33 and SEMCAP3, are placed in the correct
orientation and with respect to the breakpoints. 

Figure 2 shows a representation of the genomic 
structure of the SEMCAP3 gene: its spliced exons spread 
over a genomic extent of approximately 250kb.
Above the gene, the coding contribution of each exon to the
SEMCAP3 protein is indicated by bars and finely dashed
lines. The domain structure of SEMCAP3 protein is shown at
the top of the figure. 'RING' refers to a RING-finger
domain, 'ZF-T.' to a TRAF-type zinc finger (also referred
to as a sina domain) and 'PDZ' to the PDZ domain present in
PSD-95, Dlg, and ZO-1/2. The BAC clones used to identify
the breakpoint location are included at the bottom of the 
figure together with the inferred direction (arrows) of the 
breakpoint from the FISH results using these clones. The 
heavy dashed line shows the position of the breakpoint with 
respect to the gene exons and the domain structure of the 
protein. 

Figure 3 Nucleic acid sequence of Human SEMCAP3 
(genomic DNA sequence including CpG island/putative
promoter upstream of 5' UTR/cDNA sequence is also included
for clarity). The following features are marked for
clarity:

a) ATG start site located at position 709 (underlined)

b) GG bases (underlined) at the junction between exons 3
and 4 (i.e. between which the breakpoint is located)

c) UAA stop codon located at position 3907 (underlined).

Figure 4 Amino acid sequence of Human SEMCAP3 with
underlined regions of interest. 

a) Residues 18-55 Ring finger domain 

b) Residues 101-158 SINA/ZF-TRAF domain 

c) Residues 246-339 PDZ domain #1 

d) Residues 418-504 PDZ domain #2 

Figure 5 shows a schematic representation of the N33 
gene: exon splicing and chromosome breakpoint identified
in the present invention.

Figure 6 shows the nucleotide sequence of the various
exons for N33.

Figure 7 shows the various transcript options and
associated amino acid sequences of the transcripts for N33.

Figure 8 shows N33 protein aligned with other
homologues. 

Figure 9 shows the effect of the C-terminus of the 
various N33 splice forms. The variety of splice forms at 
the 3' end of the gene has implications for the C-terminus
of the protein. This is especially important when it is
considered that N33 is likely to reside in the Golgi/ER
compartment of the cell where C-termini are often involved
in anchoring or trafficking proteins to different
organelles. The light grey shading indicates putative
transmembrane domains. Hence, only the spliceforms with
exons 1a/1b, 2-6, 7, 8, 9, 10, 11 or 1a/1b, 2-6, 7, 8, 9, 11 are likely
to encode functional proteins and these will only differ in 
the extreme C-terminal residues. 

Figure 10 shows the published nucleotide sequence for 
GRIK4.

Figure 11 shows the published amino acid sequence for
GRIK4.

Figure 12 Breakpoints identified in the subject 
(patient 2). CEPH library YACs (Chumakov et al, 1992)
spanning the breakpoints are listed. Also detailed are the
BAC clones (and accession numbers) from the RPCI-11 BAC
library (Osoegawa et al, 2001) that span or flank (indicated
by dashes) the breakpoints. Breakpoints at 8q13 were not
characterised in this study.

Figure 13 Representation of complex chromosomal 
rearrangement in the subject (patient 2). The pericentric
chromosome 2 inversion is coupled with a translocation to
chromosome 11. The chromosome 11 region between the 11q23.3
and 11q24.3 breakpoints is inserted on chromosome 8q13.

Figure 14 Genomic arrangements of the GRIK4 gene 
disrupted in the subject. Two potential GRIK4 transcripts 
with alternative start-sites are indicated. The 1a/1a' exons
are derived from EST BE388730. The transcript
containing the 1b exon corresponds to the published GRIK4
sequence (acc. S67803). It is probable that the present
inventors' exon "4" corresponds to a number of undefined
exons which can only be subdivided after release of genomic 
sequence over this part of the gene. Hence, the actual 
number of GRIK4 transcript exons will most likely exceed 
14. BAC (grey boxes), cosmid (white boxes) and long-range 
PCR product (black line) derived FISH probes enabled the 
positioning of the breakpoint (arrows indicate the relative 
direction of the breakpoint deduced from the 
presence/absence of the signals on the two derived 
chromosomes). Probes from BAC RPCI-11 89P5 and cosmids
LA11197-C5, LA1163-H6, LA11236-G3 and LA1192-C6 indicated 
that the breakpoint was located near exons 2 and 3. A FISH 
probe synthesized from a long-range PCR product 
corresponding to the intronic sequence between these two 
exons indicated that the breakpoint lies upstream of the 
intron between exons 2 and 3.

Figure 15 5' sequence of the GRIK4 gene showing the
two possible N-terminal peptides derived from alternate
start sites. Exon combination 1a-1a'-2 is derived from an
EST sequence (acc. BE388730). Exon combination 1b-2 is
based on the published cDNA sequence (e.g. acc. S67803).
The actual amino acid sequence may differ from the
published amino acid sequence as there is a potential
downstream methionine start (MVAC... instead of MPRV...)
containing a more conserved Kozak sequence (Kozak, 1986).
It can be seen that the breakpoint upstream of exon 2 will
separate the majority of the coding sequence from the
promoter resulting in a putative null allele. Exonic DNA
sequence is shown in capitals, intronic or upstream
sequence in lower case. Conserved splice junction
sequences (EXON/GT...AG/EXON) are underlined. Single
letter amino acid codes are shown beneath the appropriate
DNA codons. A functional C/G:Leu/Val single nucleotide
polymorphism (underlined) is found within exon 2.

Figure 16 shows the complete alternative nucleic acid
sequence as identified by the present inventors. 

Figure 17 shows the complete alternative amino acid 
sequence as identified by the present inventors. 

Figure 18 shows the nucleic acid sequence of NPAS3 
spliceform 1. 

Figure 19 shows the protein sequence of NPAS3
spliceform 1.

Figure 20 shows the nucleic acid sequence of NPAS3
spliceform 2.

Figure 21 shows the protein sequence of NPAS3
spliceform 2.

Figure 22 shows an ideogram representation of the
balanced translocation in patient 3 relating to this
invention.

Figure 23 shows the genomic arrangement of the NPAS3
gene including the position of the observed breakpoint.

Figure 24 shows potential functional consequences of
the disruption to NPAS3 gene: dominant-negative activity.

Figure 25 shows the PDE4B1 nucleic acid sequence.

Figure 26 shows the PDE4B1 protein sequence.

Figure 27 shows the PDE4B3 nucleic acid sequence. 

Figure 28 shows the PDE4B3 protein sequence. 

Figure 29 shows the PDE4B2 nucleic acid sequence. 

Figure 30 shows the PDE4B2 protein sequence. 

Figure 31 a) Ideogram representation of balanced 
translocation between chromosomes 1 and 16 in patient 4. 

Figure 32 Genomic arrangements of the PDE4B gene 
disrupted in the subject (patient 4). The two long
transcripts of the PDE4B gene are shown. FISH showed the
breakpoint was within a gap in the genome sequence between
BACs RPCI-11 433N2 and RPCI-11 44211. This positioned the
breakpoint between the first and second exons of the PDE4B1
form of the gene (acc. L20966). A long-range PCR product
FISH probe corresponding to the genomic region encompassing
the 1a exons of PDE4B1 confirmed that the gene was
disrupted between exon pairs 1a and exon 2 (i.e. only
PDE4B1 transcripts are directly disrupted by the chromosome
abnormality).

Figure 33 shows an ideogram diagram of the chromosomal 
rearrangement (a reciprocal translocation) in patient 4. 
The two breakpoints are marked at the approximate 
chromosomal locations at which they are located. In 
addition, and not to scale, the two candidate
disease-causing genes, PDE4B and CDH8, are placed in the correct
orientation and with respect to the breakpoints. The fusion
genes on derived chromosomes 1 and 16 that result from the
reciprocal translocation are also indicated, demonstrating 
the potential capacity for fusion transcript/protein 
synthesis. 

Figure 34 shows a representation of the genomic 
structure of the CDH8 gene: its spliced exons spread over 
a genomic extent of approximately 400kb.
Above the gene, the coding contribution of each exon to the
CDH8 protein is indicated by bars and finely dashed lines.
The domain structure of CDH8 protein is shown at the top of
the figure. 'N' and 'C' refer to the N- and C-termini of
the protein. The broken line at the N-terminus indicates
the existence of signal peptide and proprotein domains,
both of which are cleaved off in the mature protein. The
'CD' ovals represent the positions of the five
extracellular cadherin domains. The black box signifies the 
position of the hydrophobic stretch of amino acids that act 
as the membrane-spanning domain. The BAC clones used to 
identify the breakpoint location are included at the bottom 
of the figure together with the inferred direction (arrows) 
of the breakpoint from the FISH results using these clones. 
The heavy dashed line shows the position of the breakpoint 
with respect to the gene exons and the domain structure of 
the protein. 

Figure 35 Nucleic acid sequence of Human CDH8. The
following features are marked for clarity:

a) ATG start site located at position 253 (underlined)

b) GC bases (underlined) at the junction between exons 1
and 2 (i.e. between which the breakpoint is located)

c) UGA stop codon located at position 2650 (underlined).

Figure 36 Amino acid sequence of Human CDH8 with 
underlined regions of interest. 

a) Residues 1-29 signal peptide domain (italics) 

b) Residues 30-61 propeptide fragment cleaved off in mature
protein.

c) Residues 76-158 cadherin domain #1 (underlined)

d) Residues 172-248 cadherin domain #2 (underlined)

e) Residues 281-383 cadherin domain #3 (underlined)

f) Residues 396-487 cadherin domain #4 (underlined)

g) Residues 500-597 cadherin domain #5 (underlined)

h) 'V' highlighted at position 513 is the last residue in
common with the putative truncated rat protein product from
the alternatively spliced form.

i) Residues 622-645 transmembrane domain #1 (underlined).

Figure 37 

a) Fusion protein product resulting from CDH8 promoter/exon 
1 spliced to PDE4B exon 2 and beyond (transcribed on 
der(16)). The underlined residues 'RV' represent the fusion
site between the two genes.

b) Fusion protein product resulting from PDE4B promoter
(long form)/exon 1a spliced to CDH8 exon 2 and beyond
(transcribed on der(1)). See text for details: only the
reading frame producing the N-terminal truncated form of 
the CDH8 protein is shown. The underlined 'go 1 at position 
68 represents the point of fusion between the two genes. 
Three potential methionine translation start sites are 
shown (highlighted) with the second of these having a 
nucleic acid sequence most similar to the canonical Kozak 
sequence (underlined). Use of this start site would 
generate a truncated CDH8 protein lacking the signal 
peptide, proprotein fragment, cadherin domain 1 and most of 
cadherin domain 2.

Materials and methods

Lymphocyte extraction and metaphase chromosome preparation 

Lymphocytes were extracted from 7mls of patient blood
(for storage and generation of EBV-transformed cell lines)
using density gradient separation (Histopaque-1077, Sigma).
In order to generate metaphase-arrested chromosomes for
cytogenetic analysis, 0.8mls of patient blood were cultured
for 71hrs in medium containing phytohaemagglutinin
(Peripheral Blood Medium, Sigma). The short-term cultures
were treated with colcemid for one hour followed by a 
conventional fixing procedure. Fixed chromosomes were 
dropped onto microscope slides and stored for 1 week prior 
to use in FISH experiments. 

Selection of YAC clones for FISH probe synthesis 

YAC clones were selected from the Whitehead/MIT map of 
the relevant chromosome in the cytogenetic intervals within 
which the breakpoints were adjudged to lie. YACs were 
obtained from the HGMP Resource Centre, Babraham
Bioincubator, Babraham, Cambridge, UK
(http://www.hgmp.mrc.ac.uk/). Clone DNA was prepared by
standard methods and PCR amplified using primers designed
against consensus sequence elements within the archetypal
Alu repeat (Breen et al, 1992). This "Alu-PCR" gives a
representative spread of non-repetitive sequence over the 
full length of the YAC and generates a better FISH probe 
than native YAC DNA. Alu-PCR was performed using the Expand 
Long Template PCR kit (Roche). Cycling conditions: 94°C -
45s, 55°C - 30s, 68°C - 8min: 35 cycles. 68°C - 10min final
extension. 
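
As a rough planning aid only, the Alu-PCR cycling programme can be written down as data and its nominal run time estimated. The Python sketch below is illustrative: the function name is hypothetical, ramping between temperatures is ignored, and the final extension is read as 10 min.

```python
def program_minutes(step_seconds, cycles, final_extension_seconds):
    """Nominal hold time of a PCR programme in minutes (ramp times ignored)."""
    return (cycles * sum(step_seconds) + final_extension_seconds) / 60


# Alu-PCR: 94 C for 45 s, 55 C for 30 s, 68 C for 8 min; 35 cycles;
# then a final extension at 68 C, read here as 10 min.
alu_pcr_minutes = program_minutes(
    step_seconds=[45, 30, 8 * 60],
    cycles=35,
    final_extension_seconds=10 * 60,
)
```

On these assumptions the programme holds for a little over five and a half hours, most of it in the long 68°C extension step.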

Fluorescence in situ hybridisation (FISH) protocol 

Probe template DNA (pooled Alu-PCR products, BAC clone
DNA, cosmid clone DNA or long-range PCR products) was
labelled by nick translation and hybridised to patient
metaphase spreads using standard FISH methods. Slides were
counterstained with DAPI in Vectashield anti-fade solution
(Vector Laboratories). A Zeiss Axioskop fluorescence
microscope with a Chroma number 81000 or 830000
multi-spectral filter set was used to observe the chromosomal
hybridisations. Images were captured using Vysis
SmartCapture extension running within IP Lab Spectrum or
Digital Scientific SmartCapture imaging software. FISH
signals observed on derived chromosomes dictated the
selection of further clones required to "walk" towards the 
breakpoint. Breakpoint-spanning FISH probes have signals on 
a normal chromosome and on both derived chromosomes. 
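
The interpretation rule used when "walking" towards a breakpoint can be summarised in a few lines of Python. This is a minimal sketch for a probe from chromosome A in a carrier of a reciprocal translocation between chromosomes A and B; the function name and the returned labels are hypothetical and are not taken from the imaging software.

```python
def classify_probe(on_normal, on_der_a, on_der_b):
    """Interpret FISH signals for one probe derived from chromosome A.

    on_normal: signal seen on the normal chromosome A homologue
    on_der_a:  signal seen on the derived chromosome A
    on_der_b:  signal seen on the derived chromosome B
    """
    if not on_normal:
        # Without the control signal the hybridisation is not interpretable.
        return "uninformative"
    if on_der_a and on_der_b:
        # Probe sequence is split between both derived chromosomes.
        return "spans breakpoint"
    if on_der_a:
        return "retained on der(A)"
    if on_der_b:
        return "translocated to der(B)"
    return "uninformative"
```

Two probes classified as "retained on der(A)" and "translocated to der(B)" flank the breakpoint, so the next clones to test are those lying between them.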

Resolution of breakpoint position 

BAC clones corresponding to positive YAC regions were 
arranged into contigs by consulting the Washington 
University FPC
(http://www.genome.wustl.edu/gsc/human/Mapping/index.shtml),
UCSC GoldenPath Draft Human Genome Browser
(http://genome.ucsc.edu/goldenPath/hgTracks.html) and
Ensembl (http://www.ensembl.org/) databases. BAC clones
were supplied by BACPAC Resources, Oakland, California, USA 
(http://www.chori.org/bacpac/). Clone selection was biased 
to gene-containing BACs. Once a breakpoint-spanning BAC was 
identified, the position of the breakpoint in relation to 
candidate gene exons was determined by FISH probes 
generated from chromosome-specific library cosmids (HGMP
Resource Centre) or precisely positioned, repeat
element-free long-range PCR products (Expand long range PCR
kit, Roche; see below for primer sequences). Cycling
conditions: 94°C - 45s, 52°C - 30s, 68°C - 11min: 35 cycles.
68°C - 15min final extension. Cosmids were isolated by
probing the appropriate chromosome-specific library filters
(HGMP-RC) with isotopically labelled exon-specific PCR
products. 




Example 1: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 1 

FISH experiments on chromosome 3p13 had narrowed the
location of the breakpoint to a region including the large
gene SEMCAP3 (approximately 250kb genomic extent). Two BAC
clones were selected from the tiling diagram of BAC clones
placed on the human genome map backbone (June 2002 release
of the 'BAC End Pairs' track on the UCSC Genome Browser;
http://genome.cse.ucsc.edu/index.html?org=Human). These
were RPCI-11 606p16 and RPCI-11 94j25. By FISH, these BAC
clones flanked the breakpoint (the former translocated to
the derived chromosome 8 and the latter remained on the
derived chromosome 3). The position of these two BAC clones
indicated that the breakpoint lay within the large (200kb)
intron between exons 3 and 4 of the SEMCAP3 gene (see
Fig. 2). Thus, the inventors inferred from these results
that the SEMCAP3 gene was directly disrupted by the 3p13
translocation event and, as such, is a candidate gene for 
the psychiatric disorder exhibited by the patient. 

Semcap3 (semaphorin cytoplasmic domain-associated 
protein) was originally identified in mouse as a gene 
encoding a protein that interacts with M-semF/Sema4c. Two
forms, 3A and 3B, were submitted to the public nucleic acid
sequence database (Wang & Strittmatter, 1999) but have yet
to be published. It appears that 3B may be an artifactual
sequence as it displays deletions in the sequence. Semcap3a
is identical in structure to the predicted human gene,
KIAA1095, and the inventors refer to this sequence as human
SEMCAP3. The yeast two-hybrid screen that isolated Semcap3a/b
also identified Semcap1 and Semcap2 as genes encoding proteins
which interact with the cytoplasmic tail of the SEMA4C
protein (Wang et al., 1999).

The purpose of these screening experiments was to 
elucidate cytoplasmic interactors with the transmembrane 
receptor, SEMA4C. This protein belongs to a large group of
signalling proteins described as 'semaphorins'. In the
brain, these proteins are thought to play important roles
in brain development through their action on axonal 
guidance and growth cone stability. Inagaki et al., (1995) 
showed that Sema4C is expressed in the developing mouse 
brain. One proposed explanation for the origin of 
psychiatric disorders (including the disorder exhibited by 
the patient described here) is the incorrect development of 
the brain, particularly the connections, projections and 
neural networks between brain subregions. With this in 
mind, semaphorins, and the proteins that interact with them 
(such as the SEMCAPs), become attractive candidate genes
for the psychiatric disorders. 

It is suspected that the PDZ domains (see Fig. 2) of 
the SEMCAP3 protein will be involved in protein-protein 
interactions (such as SEMA4C interaction) as they are in 
other proteins. The RING-finger domain of SEMCAP3 
identifies it as belonging to a class of proteins known as 
ubiquitin ligases. Ubiquitin ligases specifically target 
proteins for ubiquitination and subsequent destruction in
the proteasome pathway. Thus, SEMCAP3 may act to regulate
the activity of other proteins (for instance, components of
the semaphorin pathway) by targeting them for destruction.
The ZF-TRAF/SINA domain is most likely an extension of the
RING-finger domain.

Figure 2 shows that the breakpoint would end SEMCAP3 
transcription after the third exon on the derived 
chromosome 3 (there would still be one normal chromosome 3 
and SEMCAP3 gene remaining in each nucleus) . If 
transcription occurs on the derived chromosome 3 then the 
resulting translated protein product would be truncated,
lacking part of the first PDZ domain and all subsequent
amino acids in the C-terminal direction. It remains to be 
investigated if the psychiatric disorder in this patient 
results from N33 perturbation on one allele, the disruption 
of SEMCAP3 on one allele, the generation of an aberrantly 
functioning truncated SEMCAP3 from one allele or a 
combination of these.

Pulver et al. (1995) detailed schizophrenia linkage to
chromosome 3p (albeit telomeric to SEMCAP3). However, two
further studies have failed to replicate these findings in
different populations (Maziade et al., 2001 & Hovatta et
al., 1998).

Example 2: Further molecular characterisation of 
chromosomal disruption and identification of disrupted gene 

In this case, primers corresponding to N33 3'UTR 
sequences and an STS, SHGC-12093 (Acc. No. G17275), were
designed (see below for primer sequences). The resulting PCR
products were used to screen the chromosome 8 specific
cosmid library (LA08). Among others, positive cosmids
LA0854-H5 (3' UTR) and LA08145-E3 (STS) were isolated and
subsequently used in FISH experiments (see below for
results).

3' UTR primers

Primer A: TGCCACGTGTTAGCAGAAAG
Primer B: TGCCTTTAACCAGATGAGGC

SHGC-12093 primers

Primer A: TCTTGTGGGTCACAATTAGGC
Primer B: TAAAAAGGTGCAGTTTCTTCAGC
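
For readers wishing to sanity-check primers such as those listed above, the following Python sketch computes length, GC fraction, reverse complement and a rough melting temperature. It is illustrative only: the helper names are hypothetical, and the Wallace rule (2°C per A/T, 4°C per G/C) is a generic rule of thumb for short oligonucleotides, not a method stated in this specification.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of bases that are G or C."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq):
    """Rough melting temperature in degrees C: 2*(A+T) + 4*(G+C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


# 3' UTR Primer A from the list above.
utr_primer_a = "TGCCACGTGTTAGCAGAAAG"
summary = {
    "length": len(utr_primer_a),
    "gc": gc_fraction(utr_primer_a),
    "tm": wallace_tm(utr_primer_a),
    "revcomp": reverse_complement(utr_primer_a),
}
```

Estimates of this kind are only a starting point; annealing temperatures are normally optimised empirically.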

The subject has schizoaffective disorder and a 
balanced reciprocal translocation between chromosomes 3 and 
8. An 8p22 breakpoint-crossing YAC, 931_a_1, was identified.
This permitted an 8p22 breakpoint-crossing BAC, RPCI-11 23j14
(acc. no. AC019292), to be found. This was shown to contain
the 3' end of the N33 gene (Fig. 6). Subsequently, FISH with
cosmids LA0854-H5 and LA08145-E3 from the LANL chromosome
8 specific library (HGMP Resource Centre, Babraham,
Cambridge, UK) flanked the breakpoint, placing it
approximately 100kb from exon 11 of N33. N33 is related to
a number of genes, human IAG2, Drosophila CG7830, C.
elegans g304348 and two yeast proteins, OST3 and OST6 (see
Fig. 8 for alignment of proteins). While the homologies
between N33 and the yeast proteins are relatively weak,
they share conserved cysteine residues and have the same
locations for the four transmembrane domains as predicted
by hydropathy plots. Ost3 and Ost6 are components of the
oligosaccharyl transferase complex responsible for the
addition of oligosaccharides to selected proteins. This has
been backed up by protein structure prediction programs
detailed in a recent report (Fetrow et al, 2001).

The present inventors have identified an alternative 
start exon, herein identified as exon 1a (see Figures 5 &
6) to that in the public database, herein identified as
exon 1b. Additionally they have identified a complex
variation of splicing with the exons and proposed sequences
of the transcripts, shown in Figures 5, 6 and 7
respectively. In view of the complex splice variations the
C-terminal sequence of the various N33 splice forms is
predicted to vary and this is shown in Figure 9. 

Because N33 lies within a linkage hotspot for
schizophrenia (Gurling et al, 2001, Brzustowicz et al,
1999, Blouin et al, 1998, Kaufmann et al, 1998, Kendler et
al, 1996, Pulver et al, 1995) the present inventors decided
to carry out an association study on this gene. Three
microsatellite markers (D8S549, N33 microsatellite and
D8S1992; primers listed below) were chosen and used to
type 25 mother-father-schizophrenic proband trios and 64
schizophrenic cases and 64 normal controls. The haplotypes
derived from the trio study were examined for frequency
bias in the case and control samples. Certain haplotypes
are over-represented in the schizophrenic case
genotypes compared to controls. Appropriate individuals
with the haplotypes are currently being screened for
mutations.

Microsatellites used in association study

D8S549

Primer A: AAATGAATCTCTGATTAGCCAAC
Primer B: TGAGAGCCAACCTATTTCTACC

N33 microsatellite

Primer A: AGGCTGAGTGCCAAAAAGTA
Primer B: CTTTAAGCTTGCTATTTGAAGGC

D8S1992

Primer A: TTCATCGTCTGAACCTGG
Primer B: ACACATTTCCTCTATGTTGC
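
The case/control comparison of haplotype frequencies described above can be illustrated with a 2x2 Pearson chi-square test on chromosome counts. This is a minimal sketch only: the specification does not say which statistical test was used, and the counts in the usage example are invented placeholders, not data from the study. The only figure taken from the text is that 64 cases and 64 controls give 128 chromosomes per group.

```python
def chi_square_2x2(case_with: int, case_without: int,
                   control_with: int, control_without: int) -> float:
    """Pearson chi-square statistic (1 degree of freedom, no continuity
    correction) for a 2x2 table of haplotype counts."""
    a, b, c, d = case_with, case_without, control_with, control_without
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("a row or column of the table sums to zero")
    return n * (a * d - b * c) ** 2 / denominator


if __name__ == "__main__":
    # HYPOTHETICAL counts for illustration only (not study results):
    # chromosomes carrying a given haplotype out of 128 per group.
    stat = chi_square_2x2(30, 98, 15, 113)
    # 3.84 is the standard 5% critical value for 1 degree of freedom.
    print(f"chi-square = {stat:.2f}; exceeds 3.84: {stat > 3.84}")
```

With three markers and many possible haplotypes, a real analysis would also need to correct for multiple testing; that step is omitted here.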

Example 3: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 2 

Psychiatric evaluation 

The subject (female) was approached and gave full,
informed written consent for this study as one of a large
cohort of people co-morbid for schizophrenia and mental
retardation. Prior to investigation she was not known to
have any abnormality of karyotype. She suffered from
chronic schizophrenia and a mild degree of mental
retardation (IQ between 65 and 70). The diagnosis of chronic
schizophrenia was confirmed using SADS-L structured
interview to generate DSM-IV and ICD-10 criteria, by a
psychiatrist experienced in both general psychiatry and the
psychiatry of mental retardation (WM). SADS can be
reliably used in patients with mild mental retardation.
Consensus diagnosis was reached on review by two
psychiatrists (WM and DB). IQ scores were generated from
WAIS-R and their stability shown by similar levels detected
by psychological examination at different times throughout
her life. There were no dysmorphic features in the
subject. However, the subject had suffered from bilateral
deafness since childhood, a consequence of surgical
operations on the mastoids. There was no family history of
mental illness or mental retardation that could be
ascertained. Other members of the family declined to
participate in the study.

An initial G-banded karyotype of this patient
indicated that the chromosome abnormality was complex
(46,XX,ins(8;11)(q13;q23.3q24.2)inv(2)(p12q32.1)
t(2;11)(q21.3;q24.2)der(2)(2qter->2q32.1::2p12->2q21.3::
11q24.2->11qter)der(11)(11pter->11q23.3::2q21.2->2q32.1::
2p12->2pter)der(8)(8pter->8q13::11q23.3->11q24.2::
8q13->8qter)), involving a pericentric inversion
of chromosome 2 coupled with rearrangements involving
chromosomes 2, 8 and 11 (Fig. 13). Figure 12 details the
YAC and BAC FISH probes crossing or bracketing breakpoints
on 2 and 11. Sequence in the locality of the breakpoints
was assessed for gene content.

PCR primers 

Long-range PCR for FISH probe templates: 
Int2-3 GRIK4a; CAGGAGGTCCTGTGAAGCTC,
Int2-3 GRIK4b; ACAGGGAAAGAAGCAAAGCA.

GRIK4 exon region-specific PCR: screening of chromosome 11
cosmid libraries:

Ex1a/a' a; AAAGCTAAGCGCAGGTGTGT,
Ex1a/a' b; TTTCTGGGAGGCAACCATAG,
Ex1b a; GCAGAGTTATGTCATGCCCA,
Ex1b b; CCTGTGCAGCACTCTGATGT,
Ex2/3 a; TTGAACCCAAGAGAACAGGG,
Ex2/3 b; TCCCCTTCTCCTTCCAGTTT


Cycling conditions: 94°C - 2min initial denaturation. 94°C -
1min, 52°C - 1min, 72°C - 75s: 33 cycles. 72°C - 15min final
extension.
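
The programmed hold time implied by these cycling conditions can be worked out directly. The sketch below simply sums the hold times stated above; it ignores ramping between temperatures, which depends on the thermal cycler and is not specified in the text, so the real run time will be somewhat longer.

```python
def total_hold_seconds(initial_s: int, cycle_steps_s: tuple, cycles: int,
                       final_s: int) -> int:
    """Sum of programmed hold times for a PCR run (ramp times excluded)."""
    return initial_s + cycles * sum(cycle_steps_s) + final_s


# Values taken from the cycling conditions above:
# 2 min initial denaturation; 33 cycles of 1 min, 1 min, 75 s; 15 min final.
TOTAL_S = total_hold_seconds(initial_s=120, cycle_steps_s=(60, 60, 75),
                             cycles=33, final_s=900)

if __name__ == "__main__":
    print(f"{TOTAL_S} s = {TOTAL_S / 60:.2f} min")  # 7455 s = 124.25 min
```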

The 11q23.3 breakpoint is located at a locus
containing a kainate-type ionotropic glutamate receptor
(GRIK4, acc. S67803 & NM_014619 (11), previous nomenclature
KA1/EAA1). Cosmid FISH directed at the individual exons
and an intron-specific long-range PCR product FISH (Fig. 15)
positioned the breakpoint within the GRIK4 gene sequence,
most likely immediately upstream of exon 2 (our
nomenclature, Fig. 15). This was confirmed using a long-
range PCR product FISH probe corresponding to the intron
between exons 2 and 3 (Fig. 15). We also identified a
GenBank EST (acc. BE388730, IMAGE clone ID:3613199)
generating an alternative start-site resulting in an
alternative cognate N-terminal peptide sequence (Figures 16
and 17). The position of a breakpoint anywhere between
exons 1a/a'/1b and exon 3 would truncate all putative
transcript forms such that no receptor function could be
encoded on the derived chromosome 11. Hence, the patient
had only one intact GRIK4 allele.

Discussion 

The present inventors identified a subject with
comorbid schizophrenia and mild learning disability in
whom chromosome translocation events have disrupted a brain-
expressed gene that is also a functional disease candidate.
Without wishing to be bound by theory it is hypothesised
that the disruption of the GRIK4 gene by a chromosomal
breakpoint (and the resulting reduced gene dosage) is the
principal underlying cause of psychiatric disease in this
patient.

The gene disrupted in this patient is both expressed
in the brain and participates in key physiological
processes in the CNS. Notably, the gene may be involved in
the alteration of the strength of synaptic/neural
transmission, a phenomenon known as long-term potentiation
(LTP). LTP is postulated to underlie cognitive functions
such as learning and memory. Moreover, cognitive testing
has previously established that these functions are
frequently affected in patients with schizophrenia.

GRIK4

Three classes of ionotropic glutamate receptors have
been identified on the basis of their pharmacological
profiles and sequence homologies: NMDA receptors, AMPA
receptors and Kainate receptors. Functional Kainate
receptors in vivo may be heteromeric, consisting of
combinations of the low kainate agonist affinity subunits
(GLUR5, GLUR6 and GLUR7) and high-affinity subunits (GRIK4
and GRIK5) (Chittajallu et al, 1999; Lerma et al, 2001 and
Werner et al, 1991). The subject with comorbid
schizophrenia and mild learning disability possesses a
complex chromosomal rearrangement. Of all the breakpoints
studied in this patient only the GRIK4 gene is directly
disrupted. This might be expected to modify kainate
receptor channel properties by altering subunit
stoichiometry.

The glutamate receptors are key initiators of synaptic
LTP (Miller and Mayford, 1999). NMDA receptors are the
principal mediators of LTP but recently presynaptic kainate
receptor-dependent plasticity changes have been observed at
mossy fibre synapses in the hippocampus (Contractor et al,
2001 and Lauri et al, 2001). Interestingly, an involvement
of the glutamate neurotransmitter system in the
pathophysiology of schizophrenia has been postulated. The
"Glutamate Hypothesis" attempts to explain the psychotic
symptoms that arise following administration of ionotropic
glutamate receptor antagonists such as phencyclidine (PCP;
"Angel Dust") and ketamine (Goff and Nine, 1997). Several
studies also point to changes, predominantly decreases, in
glutamate receptor subunit expression (including kainate
receptors) in the brains of schizophrenic patients (Ibrahim
et al, and Meador-Woodruff, 2001). Similarly, Mohn et al,
1999 report that mutant mice with reduced NMDAR1 (another
glutamate receptor) expression levels display
schizophrenia-like behaviours.

As well as aberrant neurotransmission function in the
adult, it has been suggested that neurodevelopmental
deficits may contribute to schizophrenia. Neuroanatomical
studies indicate statistically significant reduced volumes
of brain regions, primarily the hippocampus, in
schizophrenic and comorbid patients (Sanderson et al 1999
and Pearlson, 1999). GRIK4 is expressed in the amygdala,
hippocampal formation (CA3 pyramidal and dentate granule
cells) and entorhinal cortex. Glutamate receptors might
mediate brain development through the activity-dependent
refinement of neuronal connections.

The present subject was clinically diagnosed as having 
schizophrenia coupled with mild learning disability. It 
may be the case that causative gene mutations in comorbid 
patients lead to a severe phenotype or have more profound 
downstream effects than gene mutations in patients with 
schizophrenia alone (i.e. the comorbid state represents the 
severest form of schizophrenia (Doody et al, 1998)). A 
second possibility is that the gene mutation gives rise to 
the learning disability component of the illness through an 
independent effect on brain development. The manner in 
which the mutated genotype gives rise to the observed 
phenotype (via functional or developmental mechanisms) is 
a key issue in molecular neurobiology, particularly in the 
characterisation of mouse "knockout" mutants (Mayford et 
al, 1995). 

A large number of publications detail family and 
population-based linkage studies carried out to identify 
psychiatric illness susceptibility loci. The results have 
not been conclusive perhaps indicating the presence of 
confounding factors such as population stratification, 
incomplete penetrance, genetic heterogeneity and uncertain 
mode of inheritance. Nevertheless, GRIK4 lies at the edge
of a schizophrenia linkage region described in a recent
publication (Gurling et al, 2001). The most centromeric
marker exhibiting linkage to schizophrenia in this paper,
D11S925, is located within an intron at the 3' end of GRIK4.

Example 4: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 3 

Fine FISH mapping of the breakpoint with cosmid clones

PCR products corresponding to regions in or near
hNPAS3 exons 4, 5 and 6 were obtained using the following
primers under standard PCR conditions (Exon 4-i
ACAACCATTCTGGGAACAGC, Exon 4-ii GTGTAGGGAAAGCCATCCAA, Exon
5-i TCTTTTTCCTGCAGTCCCTG, Exon 5-ii CTCCAAATGACTCCTGCCAT,
Exon 6-i GCCTCTGCCATAGATTTTGC, Exon 6-ii
TTCCTTCCCACCCTTTCTCT). Probes were created by random-
primed labelling of PCR products with radioactive dCTP;
these were used to screen a LANL chromosome 14-specific
cosmid library (LA14NC01 obtained from the UK HGMP Resource
Centre, Hinxton, Cambridge) using the hybridisation
conditions set out in Church and Gilbert (1986). Positive
clones (exon 4: LA1431-G5, exon 5: LA14123-C4 and exon 6:
LA1487-D9) were prepared by a standard alkaline lysis
protocol and taken through FISH analysis as above.

Results 

Metaphase spreads from EBV-transformed cell lines were
analysed by Fluorescence in situ Hybridisation (FISH) using
successively smaller DNA probes. A breakpoint-spanning BAC
clone was obtained by FISH screening (RPCI-11 BAC 1078i14,
acc. no. AL161851). EST sequences were examined in the
genomic DNA flanking the breakpoint in order to identify
potential transcripts in the locality. A number of ESTs
were identified which had been annotated as containing
homologous sequence to the conserved "PAS" domain present
in a large number of genes (Gu et al, 2000). A search of
such genes revealed that the most closely related gene
encoded a mouse brain-expressed transcript, neuronal PAS
domain protein 3 (NPAS3 (MOP6), acc. no. AF137871;
hereafter referred to as mNPAS3). Nucleotide homology to
the mNPAS3 cDNA within human genomic DNA BAC clone
sequences at 14q13 using the BLAST algorithm identified 12
exons corresponding to the human orthologue of mNPAS3
(hNPAS3) distributed over a genomic region of approximately
800-900Kb, making it among the largest gene loci in the
human genome (Figure 23). Subsequently, full length hNPAS3
cDNA sequences have been submitted by two other groups to
GenBank/EMBL with the accession numbers AB054575 and
AF164438, although these have differences to the mouse
splice-form in the 5' exons. This is due to the presence
of two alternative transcription start sites employed in
both human and mouse genes. This was confirmed by analysis
of published cDNA and EST sequences coupled with further
sequencing of corresponding IMAGE clones. These splice
variants are highlighted in Figures 18, 30 and 23.

The ratio of fluorescent signals on the derived 
chromosomes 9 and 14 from the breakpoint-spanning BAC 
probe, 1078i14, indicated that the breakpoint was located
at the centromeric end of the BAC. This is the location of 
exon 5 of the gene. Exon 4-, 5- and 6-containing cosmids 
were isolated and used as FISH probes to provide definitive 
proof of the location of the breakpoint and confirmation 
that a full-length transcript (and hence protein) cannot be 
synthesized on the derived chromosome 14. An exon 5- 
containing cosmid (see Figure 23) spanned the breakpoint. 
Subsequently a long-range PCR product-derived FISH probe 
corresponding to exon 5 indicated that the breakpoint lay 
upstream of exon 5. 

Long-range PCR primers - NPAS3 exon 5 

a) ccagcttgtatgtggtgtgg

b) ttactcccagtgcccattgt.

Discussion

A FISH-based approach has shown that the gene, NPAS3,
is disrupted by a chromosomal rearrangement present in a
mother and daughter who suffer from comorbid schizophrenia
and learning disability respectively. NPAS3 is a brain
expressed transcription factor of the basic helix-loop-
helix PAS domain class which includes members such as AHR
and ARNT.

Neuronal pas3 (NPAS3) was originally cloned in the 
mouse (Brunskill et al, 1999) on the basis of its sequence 
homology with other PAS domain proteins. Its expression 
has been characterised in the developing mouse embryo where 
high levels are seen in the neural tube, neuroepithelium 
and, later, the neopallial layer of the cortex. Non-neural 
expression was also observed in the heart, limb and kidney. 
In the mouse, NPAS1 (human chromosomal location, 19q13) is
expressed in deep pyramidal cortex cells, hippocampus and
amygdala (Zhou et al., 1997). NPAS2 (human chromosomal
location, 2q13) is expressed in the cortex, hippocampus and
thalamus. Lower levels were also seen in spinal cord, 
intestines and uterus. NPAS2 was also recently deleted in 
mice by homologous recombination (Garcia et al., 2000) 
leading to deficits in cued and contextual memory. In 
addition NPAS2 appears to have a role in cellular energy 
state monitoring and the circadian rhythm pathway (Reick et 
al, 2001 and Rutter et al, 2001). The translocation event 
described herein disrupts the gene between exons 4 and 5. 
If transcription occurred at this disrupted locus, a 
truncated protein would result containing only the bHLH 
domain. It is conceivable that this protein would have a 
dominant negative effect on wild-type NPAS3 protein (or any 
other heterodimeric protein partner) through the creation 
of non-functional dimers (see Figure 24 for explanatory 
diagram). This would result in a potentially more severe
or penetrant phenotype than a conventional point mutation. 
Two examples where bHLH-PAS proteins have been altered
through loss of the C-terminal PAS domain (one
experimentally, the other in a patient with a chromosome
translocation) have resulted in probable dominant negative
action (Maemura et al, 1999, Holder jr. et al, 2000).

Mutations in this gene in karyotypically normal 
individuals would not be expected to have as severe or 
penetrant effects as those observed in the two t(9;14)
patients.

Sequence comparison between hNPAS3 and other members 
of the NPAS sub-family show that homologies are largely 
restricted to the N-terminal end of the protein; the 
location of a basic helix-loop-helix and PAS domains. The 
greatest homology is with NPAS1, then NPAS2 and other PAS
domain-containing proteins (data not shown). An alignment
of the cognate human (conceptually translated from the
splice-form containing exons 1-12) and mouse NPAS3 proteins
reveals near identity over the N-terminal half of the
protein but increased divergence at the C-terminal end.
This is particularly the case for two stretches where 5 and
7 amino acids, respectively, have been gained in the human
orthologue (Fig. 21). These correspond to two poly-glycine
tracts present within exon 12 (of 11 and 10 residues
respectively). Such tracts can be indicative of slipped
strand mispairing whereby trinucleotide repeats are
aberrantly expanded or deleted. Where they occur in coding
sequence, increases in the number of trinucleotide repeats
can have a pathological effect on protein function (e.g.
Huntington disease and Spino-cerebellar ataxia 1). Another
feature of such repeats is their unstable nature between 
generations: a lowering of the age of onset of a disease 
from generation to generation (anticipation) can often be 
directly linked to an increase in the number of repeat 
units. 
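
Homopolymer tracts such as the poly-glycine runs discussed above can be located in a translated sequence with a simple regular-expression scan. The sketch below is illustrative: the function name, the minimum-length threshold and the toy peptide are assumptions made for this example, and the toy peptide is not the NPAS3 sequence (it merely contains runs of 11 and 10 glycines, matching the tract lengths quoted in the text).

```python
import re


def homopolymer_tracts(protein: str, residue: str = "G",
                       min_length: int = 8) -> list:
    """Return (start_index, length) for each run of `residue` that is at
    least `min_length` long. Indices are 0-based."""
    pattern = re.compile(f"{re.escape(residue)}{{{min_length},}}")
    return [(m.start(), len(m.group()))
            for m in pattern.finditer(protein.upper())]


# Hypothetical toy peptide, NOT the real hNPAS3 protein.
TOY_PEPTIDE = "MA" + "G" * 11 + "SP" + "G" * 10 + "K"

if __name__ == "__main__":
    print(homopolymer_tracts(TOY_PEPTIDE))
```

Running the same scan on aligned human and mouse proteins and comparing tract lengths is one way to quantify the gained residues described above.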

Exon 12 (coding for the C-terminus of the protein) is 
also noteworthy because of the extremely high density of 
CpG dinucleotides (in humans and mouse); a feature that 
abruptly ends at the junctions with flanking intronic/3' 
sequences. This "CpG island" is unusual because it is both 
transcribed and also located at the 3' rather than 5' end
of the gene. The significance of this in terms of 
potential transcriptional control by methylation or 
susceptibility to mutation is as yet unknown. However, the 
high level of G and C bases creates a bias in amino acid 
composition such that alanine, glycine, histidine and 
proline are over-represented. This may explain the 
presence and expansion of the poly-glycine tracts in Npas3. 
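
The CpG density of a region such as exon 12 can be summarised by its GC content and its observed/expected CpG ratio, where the expected count is (number of C x number of G) / length. The sketch below computes both values; the function name and the toy sequences in the checks are illustrative assumptions. Commonly cited CpG island criteria (GC content above 50%, observed/expected ratio above 0.6, length of at least 200 bp) are not taken from this specification and are mentioned only as a familiar reference point.

```python
def cpg_stats(seq: str) -> tuple:
    """Return (gc_content, observed_over_expected_cpg) for a DNA sequence.

    expected CpG = (count of C * count of G) / sequence length
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_content = (c + g) / n
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


if __name__ == "__main__":
    gc, ratio = cpg_stats("CGCGCGCG")  # toy sequence for illustration
    print(f"GC content {gc:.2f}, CpG obs/exp {ratio:.2f}")
```

Applied in a sliding window along the gene, a calculation of this kind would show the abrupt drop at the exon 12 boundaries described above.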

14q13 is also the site of linkage to Fahr's syndrome
(idiopathic basal ganglia calcification; IBGC) as
determined from analysis of families (Geschwind et al,
1999). Fahr's syndrome symptoms are often accompanied by
psychoses such as schizophrenia. Thus, it may be the case
that NPAS3 is also the gene responsible for Fahr's
syndrome.

Example 5: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 4 

Psychiatric evaluation 

The subject (male) is the proband in a family
segregating a t(1;16) balanced reciprocal translocation.
He gave full informed consent to the study. His diagnosis
of chronic schizophrenia was confirmed by SADS-L structured
interview and a consensus reached by two psychiatrists (WM
and DB). He does not have mental retardation. Other
members of his near family also gave consent to participate
in this study, none of whom had current mental illness
(several are below the age of risk for psychiatric
illness). There was also a history of mental illness
(major depressive disorder) in members of the extended
family who were known to be translocation carriers, but
they could not be approached for confirmation at the time
of the current study. An unrelated individual (now
deceased) with DSM-IV chronic schizophrenia without
learning disability also had a t(1;16) balanced
translocation with the same breakpoints (at the resolution
of G-banding).

PCR primers 

Long-range PCR for FISH probe templates: 
PDE4B3a; GTCAGACAAATCCAAATGGAGAG,
PDE4B3b; CTTTCTCCTGTCACTTTCCTTCA.

Cycling conditions: 94°C - 2min initial denaturation. 94°C -
1min, 52°C - 1min, 72°C - 75s: 33 cycles. 72°C - 15min final
extension.

The balanced translocation, t(1;16)(p31.2;q21), in
this family results in two breakpoints (Figure 33).
Genomic sequence at 16q21 is not complete. The only known
gene in the vicinity of the breakpoint region is Cadherin
8 (CDH8, acc. AB035305).

In contrast, on chromosome 1p31.2 FISH identified two
non-overlapping BAC clones (RP11-433N2, acc. AL513493 and
RP11-442I1, acc. AL391359) which reside on either side of
the breakpoint in this patient. The breakpoint-containing
genomic region between these two BAC clones has yet to be
sequenced (see Figure 32). Database annotation of the two
BAC clones together with BLAST mapping of exons onto
genomic sequence indicated that this locus contains a cAMP
phosphodiesterase gene, PDE4B. Two cDNAs corresponding to
longer transcript forms of this gene (denoted PDE4B1, acc.
L20966 and PDE4B3, acc. U85048, respectively) have been
previously characterised (Bolger et al, 1994; Huston et al,
1997). Long-range PCR product FISH (Figure 32) confirmed
that the PDE4B1 transcript is directly disrupted by the
breakpoint (although additional position-effect
perturbation of PDE4B3 expression cannot be ruled out).
Huston et al. (1997) have previously shown that the PDE4B1
transcript encodes an alternative N-terminal peptide
sequence. In addition, they demonstrated that only this
form is expressed in the brain. It is therefore predicted
that this patient will have a reduction in the levels of
functional PDE4B in the brain.

Discussion

The present inventors have identified a subject with 
DSM-IV chronic schizophrenia in whom chromosome
translocation events have disrupted brain-expressed genes 
that are also functional disease candidates. Without 
wishing to be bound by theory it is hypothesised that the 
disruption of the PDE4B gene by a chromosomal breakpoint 
(and the resulting reduced gene dosage) is the principal 
underlying cause of psychiatric disease in this patient. 

The gene disrupted in this patient is both expressed 
in the brain and participates in key physiological 
processes in the CNS. Notably, the gene may be involved in 
the alteration of the strength of synaptic/neural
transmission, a phenomenon known as long-term potentiation
(LTP). LTP is postulated to underlie cognitive functions
such as learning and memory. Moreover, cognitive testing 
has previously established that these functions are 
frequently affected in patients with schizophrenia. 

PDE4B 

Stimulation of the G protein coupled
receptor/heterotrimeric G protein pathway results in the
synthesis of the secondary messenger, cAMP, by members of
the adenylyl cyclase family of enzymes. This secondary
messenger triggers a well-characterised signalling cascade
that is principally mediated by cAMP-dependent protein
kinase A (PKA) and cAMP-responsive transcription factor,
CREB, both of which have been implicated in the molecular
pathways of LTP (Abel & Latal, 2001). cAMP signalling is
attenuated by its breakdown by members of the
phosphodiesterase enzyme family. Four members of the PDE4
sub-family of cAMP phosphodiesterases have been identified
to date (PDE4A-PDE4D). These four genes are the human
homologues of the Drosophila learning and memory mutant
gene, Dunce. The long form of the PDE4B protein, PDE4B1,
is the only splice form with brain expression and the
present inventors have shown that it is disrupted in the
subject. Anti-PDE4B antibodies revealed expression within
the inferior olive, the hypothalamus, the ventral striatum,
the cerebellar molecular layer, globus pallidus, nucleus
accumbens and substantia nigra (Cherry & Davis, 1999). The
authors of this expression study suggested that PDE4B
expression strongly correlates with brain areas underlying
reward and affect in mammals. In addition, PDE4 proteins
are recognised as the molecular targets for Rolipram, a
drug with anti-depressant effects. Rolipram inhibition of
PDE4 activity has been shown to improve long-term
hippocampal LTP and spatial memory in mice (Barad et al,
1998 and Bach et al, 1991). The (heterozygous) disruption
to PDE4B1 described here may be equivalent to 50% reduction
of protein product in the brain. This could result in a
greater cAMP half-life and a concomitant increase in the
activation of downstream cAMP targets.
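
The suggested link between a 50% reduction in protein and a longer cAMP half-life can be made concrete with a deliberately simple model. The sketch below assumes, purely for illustration and without support from the text, that cAMP is cleared by first-order kinetics, that the rate constant is proportional to the amount of active enzyme, and that the disrupted isoform accounts for all of the relevant phosphodiesterase activity. Under those assumptions the half-life is ln(2)/k, so halving the enzyme doubles the half-life. The rate constant used is a hypothetical placeholder.

```python
import math


def first_order_half_life(rate_constant: float) -> float:
    """Half-life of a substrate removed by first-order kinetics."""
    if rate_constant <= 0:
        raise ValueError("rate constant must be positive")
    return math.log(2) / rate_constant


def half_life_fold_change(enzyme_fraction: float) -> float:
    """Fold change in half-life when enzyme level is scaled by
    `enzyme_fraction`, assuming rate constant proportional to enzyme."""
    k_normal = 1.0  # hypothetical rate constant, arbitrary units
    return (first_order_half_life(k_normal * enzyme_fraction)
            / first_order_half_life(k_normal))


if __name__ == "__main__":
    print(half_life_fold_change(0.5))  # heterozygous loss: one allele intact
```

In a real cell other phosphodiesterases also hydrolyse cAMP, so the actual change would be smaller than this upper-bound sketch suggests.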

In addition, the disruption to PDE4B shows reduced 
penetrance as not all translocation carriers present with 
psychiatric illness (although all members of the extended 
family with psychiatric illness possess the translocation 
karyotype; data not shown).

Example 6: Molecular characterisation of chromosomal 
disruption and identification of disrupted gene from 
patient 4 

FISH experiments on chromosome 16q21 had narrowed the
location of the breakpoint to a region including the large
gene CDH8 (approximately 400kb genomic extent). Three BAC
clones were selected from the tiling diagram of BAC clones
placed on the human genome map backbone (June 2002 release
of the 'BAC End Pairs' track on the UCSC Genome Browser;
http://genome.cse.ucsc.edu/index.html?org=Human). These
were RPCI-11 599c11, RPCI-11 875e12 and RPCI-11 685m21. By
FISH, these BAC clones flanked the breakpoint (the first
two translocated to the derived chromosome 1 whereas the
third remained on the derived chromosome 16). The position of
these three BAC clones indicated that the breakpoint lay
within the large (100kb) intron between exons 1 and 2 of
the CDH8 gene (see Fig. 2). Thus, the inventors inferred
from these results that the CDH8 gene was directly
disrupted by the 16q21 translocation event and, as such, is
a candidate gene for the psychiatric disorder exhibited by
the patient. The similar disruption of the PDE4B gene on
chromosome 1 and their relative orientations on the two
chromosomes raised the possibility that the derived
chromosomes (the two chromosomes resulting from the
translocation: der(1) and der(16)) could transcribe
fusion/hybrid genes. This has been frequently seen in cases
where a translocation gives rise to susceptibility to
cancers. In essence, the translocation in the proband
resulted in an exchange of the two genes' promoter and
first exon sequences. On the der(1) the promoter and first
exon of the CDH8 gene are juxtaposed to exon 2 and
downstream of the PDE4B gene (see Fig. 33). However, the
reading frames of these two gene segments are not the same,
resulting in a prematurely truncated peptide with only the
signal peptide, proprotein fragment and a small portion of
the cadherin domain contained within (see Fig. 37a). This
would be expressed in the same cell types/tissues as the
normal CDH8 gene but the functional/pathological
significance of this small peptide is not clear at the
current time. On the der(16) the PDE4B promoter and exon 1a
are juxtaposed to exon 2 and downstream of the CDH8 gene
(see Fig. 33). Exon 1a of PDE4B does not contain a
translation start-site so the reading frame compatibilities
of the putative fused transcript are not an issue. However,
exon 2 and downstream of the CDH8 gene contain several ATG
start-sites which could be employed by translational
machinery to generate peptide sequences. In two of the
reading frames, any generated peptides would be small and
probably of no consequence. The third reading frame (the
normal CDH8 reading frame, see Fig. 5b) contains three ATG
start-sites early on, with the second of these forming a
very good match to the canonical Kozak sequence found at
most translation start-sites (CCAxxATGG). If this one is
used then the resulting peptide will be identical to normal
CDH8 protein but lacking the N-terminal portion encoding
the signal peptide, proprotein fragment, the first cadherin
domain and most of the second cadherin domain. Although the
bulk of the peptide sequence is as the normal CDH8 protein,
the lack of the N-terminal sequences may prevent the
protein from entering the Golgi/ER subcellular compartments
- a process that is required for the correct insertion
in/trafficking to the cell membrane. The
functional/pathological consequence of the presence of this
truncated form of the CDH8 protein in the cytoplasm of
tissues where the long form of the PDE4B gene is expressed
is uncertain at this point.
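
The search for ATG start-sites and their match to the Kozak motif quoted above (CCAxxATGG, with x standing for any base) can be sketched as a pattern scan over a transcript sequence. The function names and the toy sequence below are assumptions made for this example; the toy sequence is not taken from CDH8 or PDE4B. Lookahead patterns are used so that overlapping candidates are not missed.

```python
import re

# Motif exactly as quoted in the text: CCA, two arbitrary bases, ATG, G.
KOZAK_MOTIF = re.compile(r"(?=CCA[ACGT]{2}ATGG)")
ANY_ATG = re.compile(r"(?=ATG)")


def atg_positions(seq: str) -> list:
    """0-based positions of every ATG in the sequence."""
    return [m.start() for m in ANY_ATG.finditer(seq.upper())]


def kozak_atg_positions(seq: str) -> list:
    """0-based positions of ATGs that sit in a full CCAxxATGG context.
    The ATG begins 5 bases after the start of the motif."""
    return [m.start() + 5 for m in KOZAK_MOTIF.finditer(seq.upper())]


def frame_of(position: int) -> int:
    """Reading frame (0, 1 or 2) of a position relative to the sequence start."""
    return position % 3


# Hypothetical toy transcript fragment for illustration only.
TOY_TRANSCRIPT = "TTTCCAGCATGGAAATGCC"

if __name__ == "__main__":
    kozak = set(kozak_atg_positions(TOY_TRANSCRIPT))
    for pos in atg_positions(TOY_TRANSCRIPT):
        label = "Kozak match" if pos in kozak else "no Kozak match"
        print(f"ATG at {pos}, frame {frame_of(pos)}: {label}")
```

Grouping the reported ATGs by frame mirrors the frame-by-frame reasoning in the paragraph above; a fuller treatment would score partial matches to the motif rather than requiring an exact one.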

In summary, the psychiatric illness seen in the 
proband, and other members of the family, may be the result 
of one (or a combination) of the following circumstances: 
the loss (through disruption) of one allele of PDE4B, the
loss (through disruption) of one allele of CDH8 or the 
generation of potentially pathological fusion polypeptides. 

Cadherin-8 was first cloned in humans (Tanihara et 
al., 1994) and later in mouse (Munro et al., 1996) and rat
(Kido et al., 1998). Sequence analysis immediately placed 
the gene product within the large family of membrane- 
spanning proteins with extracellular cadherin domains 
thought to mediate calcium-dependent homophilic 
interactions between adjacent cells. As such, the cadherins 
are members of the functionally defined group of cell 
adhesion proteins. 

CDH8 is a member of the Type II, or atypical, 
cadherins which are defined by the lack of an extracellular 
tripeptide motif, HAV, possibly involved in the binding 
specificity of Type I cadherins. Fig. 2 illustrates the 
structure of CDH8 protein which includes an extracellular 
domain containing 5 copies of the cadherin domain, a 
membrane spanning domain and a C-terminal cytoplasmic tail. 
The cytoplasmic tail is thought to signal the presence of
interactions to the intracellular compartment by mediating
receptor clustering through interaction with proteins
such as β-catenin, α-catenin and, eventually, the
cytoskeletal proteins, actin and α-actinin. In this way,
adhesion to adjacent cells can affect the cytoarchitecture 
of the cell and may even play a role in cell motility. 

The two principal roles of neuronal cadherins are 
thought to be in the mediation of certain developmental 
pathways in the brain and the regulation of synaptic 
function. The homophilic nature of cadherin interaction 
(i.e. CDH8 proteins preferentially bind to other CDH8 
proteins) has prompted the hypothesis that cadherins are 
responsible for the aggregation or interconnection of 
similar cells within an organ. This has been shown to be 
the case in the brain where CDH8 expression has been shown 
to be restricted to particular subregions and even neuronal 
patches (Redies, Bishop, Rubenstein, Korematsu X 2).

The major cadherin in the brain, N-cadherin (encoded
by CDH2), has been implicated in synaptic long-term
potentiation (LTP): the mechanism thought to underlie
learning and memory in the brain (e.g. Huntley et al.,
2002 & Bozdagi et al., 2000). Other cadherins may also
play a part in this process (Uemura, 1998 & Tang et al.,
1998). In essence, cadherins seem to form physical bridges
across the synaptic cleft which may modify synaptic
efficacy and/or spine morphology (two features of neurons
demonstrated to change after the induction of LTP).

Interestingly, two of the hypotheses used to explain 
the origins of psychiatric illness are, firstly, the 
occurrence of abnormal brain development and, secondly, the 
existence of deficits in cellular pathways manifested as 
poor performance in certain cognitive/memory tasks. The two 
roles of neuronal cadherins seem to closely mirror these 
two hypotheses, suggesting that CDH8 is a good functional 
candidate for psychiatric illness. 

CLAIMS 

1. Use of a polynucleotide fragment or fragments 
comprising SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 
gene(s) or a fragment(s), derivative(s) or homologue(s) 
thereof for the manufacture of a medicament for treating 
schizophrenia and/or affective psychosis in a subject. 

2. Use according to claim 1 wherein the SEMCAP3 
nucleotide fragment comprises the sequence found in the 
public database under accession number AF127084 - AF127088, 
KIAA1095, AB029018, XM_041363 or BC014432 or the sequence 
shown in Figure 3. 

3. Use according to either of claims 1 or 2 wherein 
the N33 polynucleotide fragment comprises the sequence 
found in the public database under accession number U42349 
or BAC RP11-23;14 or the sequences shown in Figures 6 or 7 . 

4. Use according to any preceding claim wherein the 
GRIK4 polynucleotide fragment comprises the sequence found 
in the public database under accession number NM_014619 or 
the sequences shown in Figures 10 or 16. 

5. Use according to any preceding claim wherein the 
NPAS3 polynucleotide fragment comprises the sequence found 
in the public database under accession number AB054575 or 
AF164438 or the sequences shown in Figures 18 or 20. 

6. Use according to any preceding claim wherein the 
PDE4B comprises the sequence as shown in Figures 25, 27 or 
29. 

7. Use according to any preceding claim wherein the 
CDH8 polynucleotide comprises the sequence found in the 
public database under accession number L34060, AB035305, 
NM_001796, AB010436, AB010437, BAC CTC-420A11 or AC040161 
or as shown in Figure 35. 

8. Use of a polypeptide fragment or fragments 
comprising SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 
gene(s) or a fragment(s), derivative(s) or homologue(s) 
thereof for the manufacture of a medicament for treating 
schizophrenia and/or affective psychosis in a subject. 

9. Use according to claim 8 wherein the SEMCAP3 
polypeptide fragment comprises the sequence found in the 
public database under accession number AAF22131, AAF22132 
or XP_041363, or as shown in Figure 4. 

10. Use according to either of claims 8 or 9 wherein 
the N33 polypeptide fragment comprises the sequence found 
in the public database under accession number Q13454 or as 
shown in Figures 6 or 7. 

11. Use according to any one of claims 8 to 10 wherein 
the GRIK4 polypeptide fragment comprises the sequence found 
in the public database under accession number NM_014619, or 
as shown in Figures 11 and 17. 

12. Use according to any one of claims 8 to 11 
wherein the PDE4B polypeptide fragment comprises the 
sequence as shown in Figures 26, 28 or 30. 

13. Use according to any one of claims 8 to 12 
wherein the CDH8 polypeptide fragment comprises the 
sequence found in the public database under accession 
number NP_001787 or as shown in Figure 36. 

14. Use according to any preceding claim wherein the 
polynucleotide fragment or polypeptide fragment consists 
essentially of the identified sequences. 



15. A method of diagnosing schizophrenia and/or 
affective psychosis or susceptibility to schizophrenia 
and/or affective psychosis in an individual, wherein the 
method comprises determining if SEMCAP3, N33, GRIK4, NPAS3, 
PDE4B and/or CDH8 gene(s) in the individual has/have been 
disrupted by a mutation or chromosomal rearrangement. 

16. The method according to claim 15 wherein any 
disruption is determined by detecting a relative level of 
mRNA expressed by the/said SEMCAP3, N33, GRIK4, NPAS3, 
PDE4B and/or CDH8 gene(s). 

17. The method according to claim 15 wherein a level 
of the/said SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 
gene products is detected by an immunological technique. 

18. The method according to claim 17 wherein an 
antibody or antibodies specific for the/said gene(s) is 
used to detect said gene product(s). 

19. Use of an antibody or antibodies specific for 
SEMCAP3, N33, GRIK4, NPAS3, PDE4B and/or CDH8 for diagnosis 
of schizophrenia and/or affective psychosis. 

20. Use of an antibody or antibodies specific for 
SEMCAP3, N33, GRIK4, NPAS2, PDE4B and/or CDH8 for the 
manufacture of a medicament for the treatment of 
schizophrenia and/or affective psychosis. 

21. An animal model for psychiatric disorders wherein 
the animal model has been generated by specifically 
disrupting expression of the/said SEMCAP3, N33, GRIK4, 
NPAS2, PDE4B and/or CDH8 gene(s). 

22. An animal model for psychiatric disorder wherein 
the animal model has been generated by specifically 
upregulating expression of the/said SEMCAP3, N33, GRIK4, 
NPAS2, PDE4B and/or CDH8 gene(s). 

23. A method for identifying ligands for SEMCAP3, 
N33, GRIK4, NPAS2, PDE4B and/or CDH8 gene(s) products, said 
method comprising the steps of: 

a) introducing into a suitable host cell a 
polynucleotide fragment according to the invention; 

b) culturing cells under conditions to allow 
expression of the polynucleotide fragment; 

c) optionally isolating the expression product; 

d) bringing the expression product (or the host cell 
from step b)) into contact with potential ligands which 
will possibly bind to the protein encoded by said 
polynucleotide fragment from step a); 

e) establishing whether a ligand has bound to the 
expressed protein; and 

f) optionally isolating and identifying the ligand. 

1 AAAACTTCCC CGGGTAGATT CACCCACCGG TCCTGGAAAC CTGCTAAATC CTGAAGGTTC 
61 ACAGAACCTC TGGTCAGAAC TGAAGTTGCA GCCGGAGCTT CCCGCAGGCT CTGTAACTTT 
121 CCCTGGAATG AAATAAATAA ATAAAGACCG TAAGTGCTGA GATAGCGGGC CCCAAGATAT 
181 TTTTAGTCCT CTGCAATCAG CCACTAGAGG AAGGGGGAGG GAGAAGGGAG TAAAAAAGTT 
241 TTGATCCGTT CGGGAAGGGG CTCGAAGAGA ACCCTTGGGA GAAAGCAGTA GCCTCAGCTC 
301 CAAACTCAGC GAGCTTTTCT CGGCTGGCGT TTTGTCTCCT ATAGCGTAGA CTGTAAGAGA 
361 ACAGAAAGGA GTTTCCCGAG AAGATTCAGG CTGGCGTCCT GGGCTGGCCC GTCCCTTCTG 
421 GCGAGCCTCA GTGTCCTCCC ACGCGCTTCT GCCTTCCAGC CTCCTCCCTT TTTCGGGGGG 
481 CTGGCGGGAG GCATCCAAGG CACGATGTAT GTGCGCTCGC GCTCGCGCAA ATACGGCCGG 
541 AGGAGTCCTG TTCCTCGGGC ATTTTCCGAG GAAGTCTGGA TCAATTAGGC TCAGTCCGGG 
601 GAGAGCCAGC GAGCGCGCGG GCGGCGTAGC CGGCCTGTCT GGGCCGCCTC GTGGGGAGGG 
661 AGGGGGCGCC CGGCCGCCCG GCGGCGACCC CGGGGCCTGG CCGCCACCAT GGGCTTCGAG 
721 CTGGACCGCT TCGACGGCGA CGTGGACCCG GACCTGAAGT GCGCGCTGTG CCACAAGGTC 
781 CTGGAGGACC CGCTGACCAC GCCGTGCGGC CACGTCTTCT GCGCCGGCTG CGTGCTGCCC 
841 TGGGTGGTGC AGGAGGGCAG CTGCCCGGCG CGCTGCCGCG GTCGCCTGTC GGCCAAAGAG 
901 CTCAACCACG TCCTGCCGCT CAAGCGCCTT ATCCTCAAGC TGGACATCAA GTGCGCGTAC 
961 GCGACGCGCG GCTGCGGCCG GGTGGTCAAG CTGCAGCAGC TGCCGGAGCA CCTCGAGCGC 
1021 TGCGACTTCG CGCCCGCGCG CTGTCGCCAC GCGGGTTGCG GCCAGGTGCT GCTGCGGCGC 
1081 GACGTGGAGG CGCACATGCG CGACGCGTGC GACGCGCGGC CAGTGGGCCG CTGCCAGGAG 
1141 GGCTGCGGGC TACCCTTGAC GCACGGCGAG CAGCGCGCGG GCGGCCACTG CTGCGCGCGA 
1201 GCGCTGCGGG CGCACAACGG CGCGCTCCAG GCCCGCCTGG GCGCGCTGCA CAAGGCGCTC 
1261 AAGAAGGAGG CGCTGCGCGC TGGGAAGCGC GAGAAGTCGC TGGTGGCCCA GCTGGCCGCG 
1321 GCGCAGCTTG AGCTGCAGAT GACCGCGCTG CGCTACCAGA AGAAATTCAC CGAATACAGC 
1381 GCGCGCCTCG ACTCGCTCAG CCGCTGCGTG GCCGCGCCGC CCGGCGGCAA GGGCGAAGAA 
1441 ACCAAAAGTC TGACTCTTGT CCTGCATCGG GACTCCGGCT CCCTGGGATT CAATATTATT 
1501 GGTGGCCGGC CGAGTGTGGA TAACCACGAT GGATCATCCA GTGAAGGAAT CTTTGTATCC 
1561 AAGATAGTTG ACAGTGGGCC TGCAGCCAAG GAAGGAGGCC TGCAAATTCA TGACAGGATT 
1621 ATTGAGGTCA ACGGCAGAGA CTTATCCAGA GCAACTCATG ACCAGGCTGT GGAAGCTTTC 
1681 AAGACAGCCA AGGAGCCCAT AGTGGTGCAG GTGTTGAGAA GAACACCAAG GACCAAAATG 
1741 TTCACGCCTC CATCAGAGTC TCAGCTGGTG GACACGGGAA CCCAAACCGA CATCACCTTT 
1801 GAACATATCA TGGCCCTCAC TAAGATGTCC TCTCCCAGCC CACCCGTGCT GGATCCCTAT 
1861 CTCTTGCCAG AGGAGCATCC CTCAGCCCAT GAATACTACG ATCCAAATGA CTACATTGGA 
1921 GACATCCATC AGGAGATGGA CAGGGAGGAG CTGGAGCTGG AGGAAGTGGA CCTCTACAGA 
1981 ATGAACAGCC AGGACAAGCT GGGCCTCACT GTGTGCTACC GGACGGACGA TGAAGACGAC 
2041 ATTGGGATTT ATATCAGTGA GATTGACCCT AACAGCATTG CAGCCAAGGA TGGGCGCATC 
2101 CGAGAAGGAG ACCGCATTAT CCAGATTAAT GGGATAGAGG TGCAGAACCG TGAAGAGGCT 
2161 GTGGCTCTTC TAACCAGTGA AGAAAATAAA AACTTTTCAT TGCTGATTGC AAGGCCTGAA 
2221 CTCCAGCTGG ATGAGGGCTG GATGGATGAT GACAGGAACG ACTTTCTGGA TGACCTGCAC 
2281 ATGGACATGC TGGAGGAGCA GCACCACCAG GCCATGCAAT TCACAGCTAG CGTGCTGCAG 
2341 CAGAAGAAGC ACGACGAAGA CGGTGGGACC ACAGATACAG CCACCATCTT GTCCAACCAG 
2401 CACGAGAAGG ACAGCGGTGT GGGGCGGACC GACGAGAGCA CCCGTAATGA CGAGAGCTCG 
2461 GAGCAAGAGA ACAATGGCGA CGACGCCACC GCATCCTCCA ACCCGCTGGC GGGGCAGAGG 
2521 AAGCTCACCT GCAGCCAGGA CACCTTGGGC AGCGGCGACC TGCCCTTCAG CAACGAGTCT 
2581 TTCATTTCGG CCGACTGCAC GGACGCCGAC TACCTGGGGA TCCCGGTGGA CGAGTGCGAG 
2641 CGCTTCCGCG AGCTCCTGGA GCTCAAGTGC CAGGTGAAGA GCGCCACCCC TTACGGCCTG 
2701 TACTACCCTA GCGGCCCCCT GGACGCCGGC AAGAGTGACC CTGAGAGCGT GGACAAGGAG 
2761 CTGGAGCTGC TGAACGAAGA GCTGCGCAGC ATCGAGCTGG AGTGCCTGAG CATCGTGCGC 
2821 GCCCACAAGA TGCAGCAGCT CAAGGAGCAG TACCGCGAGT CCTGGATGCT GCACAACAGC 
2881 GGCTTCCGCA ACTACAACAC CAGCATCGAC GTGCGCAGAC ACGAGCTCTC AGATATCACC 
2941 GAGCTCCCGG AGAAATCCGA CAAGGACAGC TCGAGCGCCT ACAACACAGG CGAGAGCTGC 
3001 CGCAGCACCC CGCTCACCCT GGAGATCTCC CCCGACAACT CCTTGAGGAG AGCGGCGGAG 
3061 GGCATCAGCT GCCCGAGCAG CGAAGGGGCT GTGGGGACCA CGGAAGCCTA CGGGCCAGCC 
3121 TCCAAGAATC TGCTCTCCAT CACGGAAGAT CCCGAAGTGG GCACCCCTAC CTATAGCCCG 
3181 TCCCTGAAGG AGCTGGACCC CAACCAGCCC CTGGAAAGCA AAGAGCGGAG AGCCAGCGAC 
3241 GGGAGCCGGA GCCCCACGCC CAGCCAGAAG CTGGGCAGCG CCTACCTGCC CTCCTATCAC 
3301 CACTCCCCAT ACAAGCACGC GCACATCCCG GCGCACGCCC AGCACTACCA GAGCTACATG 
3361 CAGCTGATCC AGCAGAAGTC GGCCGTGGAG TACGCGCAAA GCCAGATGAG CCTGGTGAGC 
3421 ATGTGCAAGG ACCTGAGCTC TCCCACCCCG TCGGAGCCGC GCATGGAGTG GAAGGTGAAG 
3481 ATCCGCAGCG ACGGGACGCG CTACATCACC AAGAGGCCCG TGCGGGACCG CCTGCTGCGG 
3541 GAGCGCGCCC TGAAGATCCG GGAAGAGCGC AGCGGCATGA CCACCGACGA CGACGCGGTG 
3601 AGCGAGATGA AGATGGGGCG CTACTGGAGC AAGGAGGAGA GGAAGCAGCA CCTGGTGAAG 
3661 GCCAAGGAGC AGCGGCGGCG GCGCGAGTTC ATGATGCAGA GCAGGTTGGA TTGTCTCAAG 
3721 GAGCAGCAAG CAGCCGATGA CAGGAAGGAG ATGAACATTC TCGAACTGAG CCACAAAAAG 
3781 ATGATGAAGA AGAGGAATAA GAAAATCTTC GATAACTGGA TGACGATCCA AGAACTCTTA 
3841 ACCCACGGCA CAAAATCCCC GGACGGCACT AGAGTATACA ATTCCTTCCT ATCGGTGACT 
3901 ACTGTATAAT TTTCACTTCT GCATTATGTA CATAAAGGAG ACCACTACCA CTGGGGTAGA 
3961 AATTCCTGCC TCGTTCAATG CGGCAAGTTT TTGTATATAA GATAAGTACG GTCTTCATGT 
4021 TTATAGTCCA AATTTGCAAA CCCTACAACT CTGGGTGTCA TAGGTCTATT TTAAGGGAAG 
4081 AGAGAGAAAA ACACCCTTAC TATCTTGGAA GGCAATATTA ACAAACAGAG CTTTTTTCAA 
4141 ATAGCAATTG TACTTTTCTA CCTGTACCCT TTTACATAAA GTGTTTAAAT TTCAGAAAGA 
4201 TCTTTTATTA AGCATACTTT CACAGAATAA CTTGTTTAAA CTATATTCAT ATAAAAAAGT 
4261 TAAACACGCT TTTTTTCCTG CCTAAAACAC AAATACAACT GCCAGTATGT ATTTTTAATG 
4321 GAACCCTATT TTATAATGGT ACGTTACTGA ATGTGTTTCA TATGCGTGAC CGTTAAGATA 
4381 TTATCATTTA GGTGAAGGTT TCAACTCAAA ACCACCCAAC CCGGTGGTTA ACGATTTAAT 
4441 ACATATAACC AAACCGGCAG CGTTTAGAGT TGGGATATAC ATTTAAACAT TTTCCTGGTT 
4501 AAGGTTCCCA AGAGAGTGTA AAGGTTTTAG CAGAAAGCAA AATATCTTGC ATCTTTATGG 
4561 AAGTTTAAAG CATGTTTGCA AATATTGCAG CCCATTGAAA GAATTTGCAT GTACAGGAAA 
4621 GTTGTGGATG GAGACGGTTT GTGGAATTTT AAGTGCTCAT TGTAGTAAAC TTTTGCTTTG 
4681 TAGATTTGAA GGTACAGACT TATACAGGCA AGTTCACAAA ATCATGATTA GTTACAAACA 
4741 GTAAAATGAA GTTAAAATAA ATTATTATTT TCT 
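
The listing above uses the common layout of a running coordinate followed by blocks of ten bases, sixty bases to a line. The following is a minimal sketch, assuming only that layout, of how such lines could be reduced to a plain sequence and written out again with consistent numbering. The function names `parse_numbered_sequence` and `format_numbered_sequence` are hypothetical helpers introduced for illustration and are not part of the specification.

```python
import re


def parse_numbered_sequence(text):
    """Join sequence-listing lines into one uppercase string.

    Coordinates, spaces and stray punctuation are treated as layout and
    dropped; every letter is kept as a residue (assumption: the listing
    contains no letters other than residues).
    """
    residues = []
    for line in text.splitlines():
        residues.append(re.sub(r"[^A-Za-z]", "", line))
    return "".join(residues).upper()


def format_numbered_sequence(seq, per_line=60, block=10):
    """Write a sequence as 1-based coordinate plus space-separated blocks."""
    lines = []
    for start in range(0, len(seq), per_line):
        chunk = seq[start:start + per_line]
        blocks = [chunk[i:i + block] for i in range(0, len(chunk), block)]
        lines.append(f"{start + 1} " + " ".join(blocks))
    return "\n".join(lines)


if __name__ == "__main__":
    # Two fragments copied from the listing, one with a split coordinate.
    raw = "1 AAAACTTCCC CGGGTAGATT\n2 641 CGCTTCCGCG"
    print(format_numbered_sequence(parse_numbered_sequence(raw)))
```

Because the parser discards everything that is not a letter, a coordinate broken by a stray space (for example `2 641`) does not affect the recovered sequence; a mis-read letter, however, would pass through unchanged.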

1 MGFELDRFDG DVDPDLKCAL CHKVLEDPLT TPCGHVFCAG CVLPWVVQEG SCPARCRGRL 
61 SAKELNHVLP LKRLILKLDI KCAYATRGCG RVVKLQQLPE HLERCDFAPA RCRHAGCGQV 
121 LLRRDVEAHM RDACDARPVG RCQEGCGLPL THGEQRAGGH CCARALRAHN GALQARLGAL 
181 HKALKKEALR AGKREKSLVA QLAAAQLELQ MTALRYQKKF TEYSARLDSL SRCVAAPPGG 
241 KGEETKSLTL VLHRDSGSLG FNIIGGRPSV DNHDGSSSEG IFVSKIVDSG PAAKEGGLQI 
301 HDRIIEVNGR DLSRATHPQA VEAFKTAKEP IVVQVLRRTP RTKMFTPPSE SQLVDTGTQT 
361 DITFEHIMAL TKMSSPSPPV LDPYLLPEEH PSAHEYYDPN DYIGDIHQEM DREELELEEV 
421 DLYRMNSQDK LGLTVCYRTD DEDPIGIYIS EIDPNSIAAK DGRIREGDRI IQINGIEVQN 
481 REEAVALLTS EENKNFSLLI ARPELQLDEG WMDDDRNDFL DDLHMDMLEE QHHQAMQFTA 
541 SVLQQKKHDE DGGTTDTATI LSNQHEKDSG VGRTDESTRN DESSEQENNG DDATASSNPL 
601 AGQRKLTCSQ DTLGSGDLPF SNESFISADC TDADYLGIPV DECERFRELL ELKCQVKSAT 
661 PYGLYYPSGP LDAGKSDPES VDKELELLNE ELRSIELECL SIVRAHKMQQ LKEQYRESWM 
721 LHNSGFRNYN TSIDVRRHEL SDITELPEKS DKDSSSAYNT GESCRSTPLT LEISPDNSLR 
781 RAAEGISCPS SEGAVGTTEA YGPASKNLLS ITEDPEVGTP TYSPSLKELD PNQPLESKER 
841 RASDGSRSPT PSQKLGSAYL PSYHHSPYKH AHIPAHAQHY QSYMQLIQQK SAVEYAQSQM 
901 SLVSMCKDLS SPTPSEPRME WKVKIRSDGT RYITKRPVRD RLLRERALKI REERSGMTTD 
961 DDAVSEMKMG RYWSKEERKQ HLVKAKEQRR RREFMMQSRL DCLKEQQAAD DRKEMNILEL 
1021 SHKKMMKKRN KKIFDNWMTI QELLTHGTKS PDGTRVYNSF LSVTTV 
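
The amino acid listing above corresponds to the open reading frame of the preceding nucleotide listing, read from the ATG that appears at about position 709. As a rough sketch of how that correspondence could be checked, the code below translates DNA with the standard genetic code. The function name `translate` and the choice to mark unreadable codons as `X` are assumptions made for illustration, not something taken from the specification.

```python
BASES = "TCAG"
# Standard genetic code, ordered by first, second and third base in TCAG order.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)


def translate(dna, start=0):
    """Translate dna from offset `start` up to the first stop codon.

    Codons containing a character other than A, C, G or T are reported
    as 'X' so that a mis-read base does not silently shift the output.
    """
    dna = dna.upper()
    protein = []
    for i in range(start, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if any(base not in BASES for base in codon):
            protein.append("X")
            continue
        index = (16 * BASES.index(codon[0])
                 + 4 * BASES.index(codon[1])
                 + BASES.index(codon[2]))
        residue = AMINO_ACIDS[index]
        if residue == "*":  # stop codon ends the reading frame
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # First thirty coding bases of the nucleotide listing, from the ATG.
    print(translate("ATGGGCTTCGAGCTGGACCGCTTCGACGGC"))  # MGFELDRFDG
```

The ten residues produced for this fragment match the first block of the amino acid listing, which is the kind of check that can help separate extraction errors from genuine differences between the two figures.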

1a 

atcttcctcctgctctggctgtgtgaagatctgcctccttcctcttcggcttcatgcat 
gatcgtaagtttcctgaggcctcctcagccatgcttcctgcatagcctgcagaaat 

1b 

cccgggtccctcgcaaagccgctgccatcccggagggcccagccagcgggctcccggag 
gctggccgggcaggcgtggtgcgcggtaggagctgggcgcgcacggctaccgcgcgtgg 
aggagacactgccctgccgcgatgggggcccggggcgctccttcacgccgtaggcaagc 
ggggcggcggctgcggtacctgcccaccgggagctttcccttccttctcctgctgctgc 
tgctctgcatccagctcgggggaggacagaagaaaaaggag 



2-6 

These exons have been joined together as they are always 
spliced in this way. 

aatcttttagctgaaaaagtagagcagctgatggaatggagttccagacgctcaatctt 
ccgaatgaatggtgataaattccgaaaatttataaaggcaccacctcgaaactattcca 
tgattgttatgttcactgctcttcagcctcagcggcagtgttctgtgtgcaggcaagct 
aatgaagaatatcaaatactggcgaactcctggcgctattcatctgctttttgtaacaa 
gctcttcttcagtatggtggactatgatgaggggacagacgtttttcagcagctcaaca 
tgaactctgctcctacattcatgcattttcctccaaaaggcagacctaagagagctgat 
acttttgacctccaaagaattggatttgcagctgagcaactagcaaagtggattgctga 
cagaacggatgttcatattcgggttttcagaccacccaactactctggtaccattgctt 
tggccctgttagtgtcgcttgttggaggtttgctttatttgagaaggaacaacttggag 
ttcatctataacaagactggttgggccatggtgtctctgtgtatagtctttgctatgac 
ttctggccagatgtggaaccatatccgtggacctccatatgctcataagaacccacaca 
atggacaagtg 

7 

agctacattcatgggagcagccaggctcagtttgtggcagaatcacacattattctggt 
actga 

8 

atgccgctatcaccatggggatggttcttctaaatgaagcagcaacttcgaaaggcgat 
gttggaaaaagacgga 

8+ 

This is identical to 8 except a cryptic splice acceptor 
upstream is employed. 

tttaaccattctggaacattgtgttcagagccagaaaaattaatagattttattcacat 
ctatgtctacggcttccttgacaactactgcagatgccgctatcaccatggggatggtt 
cttctaaatgaagcagcaacttcgaaaggcgatgttggaaaaagacgga 

9 

taatttgcctagtgggattgggcctggtggtcttcttcttcagttttctactttcaata 
tttcgttccaagtaccacggctatccttatag 

10 

tgatctggactttgagtgagaagatgtgatttggaccatggcacttaaaaactctataa 
cctcag 

11 

ctttttaattaaatgaagccaagtgggatttgcataaagtgaatgtttaccatgaagat 
aaactgttcctgactttatactattttgaattc 
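
Each transcript option in this figure is written as a comma-separated list of exon labels (for example 2-6,7,8,9,11), and the corresponding sequence is those exons joined in that order. A minimal sketch of that bookkeeping follows, assuming the labels are used exactly as printed, including the joined block `2-6` and the variant `8+`. The function name `assemble_transcript` and the toy exon strings are hypothetical and stand in for the real exon sequences.

```python
def assemble_transcript(option, exons):
    """Concatenate exon sequences in the order given by a transcript option.

    `option` is a string such as "2-6,8+,11"; `exons` maps each label to
    its sequence. An unknown label raises KeyError rather than being
    skipped, so a mistyped option cannot yield a shortened transcript.
    """
    labels = [label.strip() for label in option.split(",")]
    return "".join(exons[label] for label in labels)


if __name__ == "__main__":
    # Toy stand-ins, NOT the real exon sequences from the figure.
    toy_exons = {
        "2-6": "aatctt",
        "7": "agctac",
        "8": "atgccg",
        "8+": "tttaacatgccg",
        "9": "taattt",
        "10": "tgatct",
        "11": "cttttt",
    }
    for option in ("2-6,7,8,9,10,11", "2-6,11", "2-6,8+,11"):
        print(option, assemble_transcript(option, toy_exons))
```

Keeping `2-6` as a single key mirrors the statement that these exons are always spliced together, and treating `8+` as its own entry reflects that it is exon 8 extended by the upstream cryptic acceptor.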

Alternative start exons 

1a: 

MEWSSRRSIFRMNGDKFRKFIKAPPRNYS (encoded by exon 2). 

1b: 

SSRRSIFRMNGDKFRKFIKAPPRNYS 



Transcript options 



2-6,7,8,9,10,11 

aatcttttagctgaaaaagtagagcagctgatggaatggagttccagacgctcaatctt 
ccgaatgaatggtgataaattccgaaaatttataaaggcaccacctcgaaactattcca 
tgattgttatgttcactgctcttcagcctcagcggcagtgttctgtgtgcaggcaagct 
aatgaagaatatcaaatactggcgaactcctggcgctattcatctgctttttgtaacaa 
gctcttcttcagtatggtggactatgatgaggggacagacgtttttcagcagctcaaca 
tgaactctgctcctacattcatgcattttcctccaaaaggcagacctaagagagctgat 
acttttgacctccaaagaattggatttgcagctgagcaactagcaaagtggattgctga 
cagaacggatgttcatattcgggttttcagaccacccaactactctggtaccattgctt 
tggccctgttagtgtcgcttgttggaggtttgctttatttgagaaggaacaacttggag 
ttcatctataacaagactggttgggccatggtgtctctgtgtatagtctttgctatgac 
ttctggccagatgtggaaccatatccgtggacctccatatgctcataagaacccacaca 
atggacaagtgagctacattcatgggagcagccaggctcagtttgtggcagaatcacac 
attattctggtactgaatgccgctatcaccatggggatggttcttctaaatgaagcagc 
aacttcgaaaggcgatgttggaaaaagacggataatttgcctagtgggattgggcctgg 
tggtcttcttcttcagttttctactttcaatatttcgttccaagtaccacggctatcct 
tatagtgatctggactttgagtgagaagatgtgatttggaccatggcacttaaaaactc 
tataacctcagctttttaattaaatgaagccaagtgggatttgcataaagtgaatgttt 
accatgaagataaactgttcctgactttatactattttgaattc 

(MGARGAPSRRRQAGRRLRYLPTGSFPFLLLLLLLCIQLGGGQKKKENLLAEKVEQL)M 
EWSSRRSIFRMNGDKFRKFIKAPPRNYSMIVMFTALQPQRQCSVCRQANEEYQILANSW 
RYSSAFCNKLFFSMVDYDEGTDVFQQLNMNSAPTFMHFPPKGRPKRADTFDLQRIGFAA 
EQLAKWIADRTDVHIRVFRPPNYSGTIALALLVSLVGGLLYLRRNNLEFIYNKTGWAMV 
SLCIVFAMTSGQMWNHIRGPPYAHKNPHNGQVSYIHGSSQAQFVAESHIILVLNAAITM 
GMVLLNEAATSKGDVGKRRIICLVGLGLVVFFFSFLLSIFRSKYHGYPYSDLDFE 

2-6,7,8,9,11 

aatcttttagctgaaaaagtagagcagctgatggaatggagttccagacgctcaatctt 
ccgaatgaatggtgataaattccgaaaatttataaaggcaccacctcgaaactattcca 
tgattgttatgttcactgctcttcagcctcagcggcagtgttctgtgtgcaggcaagct 
aatgaagaatatcaaatactggcgaactcctggcgctattcatctgctttttgtaacaa 
gctcttcttcagtatggtggactatgatgaggggacagacgtttttcagcagctcaaca 
tgaactctgctcctacattcatgcattttcctccaaaaggcagacctaagagagctgat 
acttttgacctccaaagaattggatttgcagctgagcaactagcaaagtggattgctga 
cagaacggatgttcatattcgggttttcagaccacccaactactctggtaccattgctt 
tggccctgttagtgtcgcttgttggaggtttgctttatttgagaaggaacaacttggag 
ttcatctataacaagactggttgggccatggtgtctctgtgtatagtctttgctatgac 
ttctggccagatgtggaaccatatccgtggacctccatatgctcataagaacccacaca 
atggacaagtgagctacattcatgggagcagccaggctcagtttgtggcagaatcacac 
attattctggtactgaatgccgctatcaccatggggatggttcttctaaatgaagcagc 
aacttcgaaaggcgatgttggaaaaagacggataatttgcctagtgggattgggcctgg 
tggtcttcttcttcagttttctactttcaatatttcgttccaagtaccacggctatcct 
tatagctttttaattaaatgaagccaagtgggatttgcataaagtgaatgtttaccatg 
aagataaactgttcctgactttatactattttgaattc 

(MGARGAPSRRRQAGRRLRYLPTGSFPFLLLLLLLCIQLGGGQKKKENLLAEKVEQL)M 
EWSSRRSIFRMNGDKFRKFIKAPPRNYSMIVMFTALQPQRQCSVCRQANEEYQILANSW 
RYSSAFCNKLFFSMVDYDEGTDVFQQLNMNSAPTFMHFPPKGRPKRADTFDLQRIGFAA 
EQLAKWIADRTDVHIRVFRPPNYSGTIALALLVSLVGGLLYLRRNNLEFIYNKTGWAMV 
SLCIVFAMTSGQMWNHIRGPPYAHKNPHNGQVSYIHGSSQAQFVAESHIILVLNAAITM 
GMVLLNEAATSKGDVGKRRIICLVGLGLVVFFFSFLLSIFRSKYHGYPYSFLIK 

2-6,11 

aatcttttagctgaaaaagtagagcagctgatggaatggagttccagacgctcaatctt 
ccgaatgaatggtgataaattccgaaaatttataaaggcaccacctcgaaactattcca 
tgattgttatgttcactgctcttcagcctcagcggcagtgttctgtgtgcaggcaagct 
aatgaagaatatcaaatactggcgaactcctggcgctattcatctgctttttgtaacaa 
gctcttcttcagtatggtggactatgatgaggggacagacgtttttcagcagctcaaca 
tgaactctgctcctacattcatgcattttcctccaaaaggcagacctaagagagctgat 
acttttgacctccaaagaattggatttgcagctgagcaactagcaaagtggattgctga 
cagaacggatgttcatattcgggttttcagaccacccaactactctggtaccattgctt 
tggccctgttagtgtcgcttgttggaggtttgctttatttgagaaggaacaacttggag 
ttcatctataacaagactggttgggccatggtgtctctgtgtatagtctttgctatgac 
ttctggccagatgtggaaccatatccgtggacctccatatgctcataagaacccacaca 
atggacaagtgctttttaattaaatgaagccaagtgggatttgcataaagtgaatgttt 
accatgaagataaactgttcctgactttatactattttgaattc 

(MGARGAPSRRRQAGRRLRYLPTGSFPFLLLLLLLCIQLGGGQKKKENLLAEKVEQL)M 
EWSSRRSIFRMNGDKFRKFIKAPPRNYSMIVMFTALQPQRQCSVCRQANEEYQILANSW 
RYSSAFCNKLFFSMVDYDEGTDVFQQLNMNSAPTFMHFPPKGRPKRADTFDLQRIGFAA 
EQLAKWIADRTDVHIRVFRPPNYSGTIALALLVSLVGGLLYLRRNNLEFIYNKTGWAMV 
SLCIVFAMTSGQMWNHIRGPPYAHKNPHNGQVLFN 

2-6,7,8,11 

aatcttttagctgaaaaagtagagcagctgatggaatggagttccagacgctcaatctt 
ccgaatgaatggtgataaattccgaaaatttataaaggcaccacctcgaaactattcca 
tgattgttatgttcactgctcttcagcctcagcggcagtgttctgtgtgcaggcaagct 
aatgaagaatatcaaatactggcgaactcctggcgctattcatctgctttttgtaacaa 
gctcttcttcagtatggtggactatgatgaggggacagacgtttttcagcagctcaaca 
tgaactctgctcctacattcatgcattttcctccaaaaggcagacctaagagagctgat 
acttttgacctccaaagaattggatttgcagctgagcaactagcaaagtggattgctga 
cagaacggatgttcatattcgggttttcagaccacccaactactctggtaccattgctt 
tggccctgttagtgtcgcttgttggaggtttgctttatttgagaaggaacaacttggag 
ttcatctataacaagactggttgggccatggtgtctctgtgtatagtctttgctatgac 
ttctggccagatgtggaaccatatccgtggacctccatatgctcataagaacccacaca 
atggacaagtgagctacattcatgggagcagccaggctcagtttgtggcagaatcacac 
attattctggtactgaatgccgctatcaccatggggatggttcttctaaatgaagcagc 
aacttcgaaaggcgatgttggaaaaagacggactttttaattaaatgaagccaagtggg 
atttgcataaagtgaatgtttaccatgaagataaactgttcctgactttatactatttt 
gaattc 

(MGARGAPSRRRQAGRRLRYLPTGSFPFLLLLLLLCIQLGGGQKKKENLLAEKVEQL)M 
EWSSRRSIFRMNGDKFRKFIKAPPRNYSMIVMFTALQPQRQCSVCRQANEEYQILANSW 
RYSSAFCNKLFFSMVDYDEGTDVFQQLNMNSAPTFMHFPPKGRPKRADTFDLQRIGFAA 
EQLAKWIADRTDVHIRVFRPPNYSGTIALALLVSLVGGLLYLRRNNLEFIYNKTGWAMV 
SLCIVFAMTSGQMWNHIRGPPYAHKNPHNGQVSYIHGSSQAQFVAESHIILVLNAAITM 
GMVLLNEAATSKGDVGKRRTF 

2-6,8+,9,11 

aatcttttagctgaaaaagtagagcagctgatggaatggagttccagacgctcaatctt 
ccgaatgaatggtgataaattccgaaaatttataaaggcaccacctcgaaactattcca 
tgattgttatgttcactgctcttcagcctcagcggcagtgttctgtgtgcaggcaagct 
aatgaagaatatcaaatactggcgaactcctggcgctattcatctgctttttgtaacaa 
gctcttcttcagtatggtggactatgatgaggggacagacgtttttcagcagctcaaca 
tgaactctgctcctacattcatgcattttcctccaaaaggcagacctaagagagctgat 
acttttgacctccaaagaattggatttgcagctgagcaactagcaaagtggattgctga 
cagaacggatgttcatattcgggttttcagaccacccaactactctggtaccattgctt 
tggccctgttagtgtcgcttgttggaggtttgctttatttgagaaggaacaacttggag 
ttcatctataacaagactggttgggccatggtgtctctgtgtatagtctttgctatgac 
ttctggccagatgtggaaccatatccgtggacctccatatgctcataagaacccacaca 
atggacaagtgtttaaccattctggaacattgtgttcagagccagaaaaattaatagat 
tttattcacatctatgtctacggcttccttgacaactactgcagatgccgctatcacca 
tggggatggttcttctaaatgaagcagcaacttcgaaaggcgatgttggaaaaagacgg 
ataatttgcctagtgggattgggcctggtggtcttcttcttcagttttctactttcaat 
atttcgttccaagtaccacggctatccttatagctttttaattaaatgaagccaagtgg 
gatttgcataaagtgaatgtttaccatgaagataaactgttcctgactttatactattt 
tgaattc 

(MGARGAPSRRRQAGRRLRYLPTGSFPFLLLLLLLCIQLGGGQKKKENLLAEKVEQL)M 
EWSSRRSIFRMNGDKFRKFIKAPPRNYSMIVMFTALQPQRQCSVCRQANEEYQILANSW 
RYSSAFCNKLFFSMVDYDEGTDVFQQLNMNSAPTFMHFPPKGRPKRADTFDLQRIGFAA 
EQLAKWIADRTDVHIRVFRPPNYSGTIALALLVSLVGGLLYLRRNNLEFIYNKTGWAMV 
SLCIVFAMTSGQMWNHIRGPPYAHKNPHNGQVFNHSGTLCSEPEKLIDFIHIYVYGFLD 
NYCRCRYHHGDGSSK 

2-6,8+,11 

aatcttttagctgaaaaagtagagcagctgatggaatggagttccagacgctcaatctt 
ccgaatgaatggtgataaattccgaaaatttataaaggcaccacctcgaaactattcca 
tgattgttatgttcactgctcttcagcctcagcggcagtgttctgtgtgcaggcaagct 
aatgaagaatatcaaatactggcgaactcctggcgctattcatctgctttttgtaacaa 
gctcttcttcagtatggtggactatgatgaggggacagacgtttttcagcagctcaaca 
tgaactctgctcctacattcatgcattttcctccaaaaggcagacctaagagagctgat 
acttttgacctccaaagaattggatttgcagctgagcaactagcaaagtggattgctga 
cagaacggatgttcatattcgggttttcagaccacccaactactctggtaccattgctt 
tggccctgttagtgtcgcttgttggaggtttgctttatttgagaaggaacaacttggag 
ttcatctataacaagactggttgggccatggtgtctctgtgtatagtctttgctatgac 
ttctggccagatgtggaaccatatccgtggacctccatatgctcataagaacccacaca 
atggacaagtgtttaaccattctggaacattgtgttcagagccagaaaaattaatagat 
tttattcacatctatgtctacggcttccttgacaactactgcagatgccgctatcacca 
tggggatggttcttctaaatgaagcagcaacttcgaaaggcgatgttggaaaaagacgg 
actttttaattaaatgaagccaagtgggatttgcataaagtgaatgtttaccatgaaga 
taaactgttcctgactttatactattttgaattc 

(MGARGAPSRRRQAGRRLRYLPTGSFPFLLLLLLLCIQLGGGQKKKENLLAEKVEQL)M 
EWSSRRSIFRMNGDKFRKFIKAPPRNYSMIVMFTALQPQRQCSVCRQANEEYQILANSW 
RYSSAFCNKLFFSMVDYDEGTDVFQQLNMNSAPTFMHFPPKGRPKRADTFDLQRIGFAA 
EQLAKWIADRTDVHIRVFRPPNYSGTIALALLVSLVGGLLYLRRNNLEFIYNKTGWAMV 
SLCIVFAMTSGQMWNHIRGPPYAHKNPHNGQVFNHSGTLCSEPEKLIDFIHIYVYGFLD 
NYCRCRYHHGDGSSK 

[Figure: alignment of the N33 protein with other protein sequences, marking the N-terminal sequences, the transmembrane regions TM 2, TM 3 and TM 4, and the alternative N33 C-termini SDLDFE (1), FLIK (2) and TF (3).] 

Figure 9 

C-termini of N33 splice forms 

[Figure: alignment of the longest translations of the six N33 splice forms 2-6,7,8,11; 2-6,7,8,9,10,11; 2-6,7,8,9,11; 2-6,11; 2-6,8+,9,11 and 2-6,8+,11. All six share the sequence LVSLVGGLLYLRRNNLEFIYNKTGWAMVSLCIVFAMTSGQMWNHIRGPPYAHKNPHNGQV and diverge after it.] 

Figure 10 

Published GRIK4 nucleic acid sequence (accession NM_014619). 

1 atgccccgcg tctcggcgcc tttggtgctg cttcctgcgt ggctcgtgat ggtcgcctgc 
61 agcccgcact ccttgaggat cgctgctatc ttggacgacc ccacggagtg cagcagaggg 
121 gagcggctct ccatcaccct ggccaagaac cgcatcaacc gcgctcctga gaggctgggc 
181 aaggccaagg tcgaagtgga catctttgag cttctcagag acagcgagta cgagactgca 
241 gaaaccatgt gtcagatcct ccccaagggg gtggtcgctg tcctcggacc atcgtccagc 
301 ccagcctcca gctccatcat cagcaacatc tgtggagaga aggaggtccc tcacttcaaa 
361 gtggccccag aggagttcgt caagttccag ttccagagat tcacaaccct gaacctccac 
421 cccagcaaca ctgacatcag cgtggctgta gctgggatcc tgaacttctt caactgcacc 
481 accgcctgcc tcatctgtgc caaagcagaa tgccttttaa acctagagaa gctgctccgg 
541 caattcctta tctccaagga cacgctgtcc gtccgcatgc tggatgacac ccgggacccc 
601 accccgctcc tcaaggagat ccgggacgac aagaccgcca ccatcatcat ccacgccaac 
661 gcctccatgt cccacaccat cctcctgaag gcagccgaac ttgggatggt gtcagcctat 
721 tacacataca tcttcactaa tctggagttc tcactccaga gaacggacag ccttgtggat 
781 gatcgtgtca acatcctggg attttccatt ttcaaccaat cccatgcttt cttccaagag 
841 tttgcccaga gcctcaacca gtcctggcag gagaactgtg accatgtgcc cttcactggg 
901 cctgcgctct cctcggccct gctgtttgat gctgtctatg ctgtggtgac tgcggtgcag 
961 gaactgaacc ggagccaaga gatcggcgtg aagcccttgt cctgcggctc ggcccagatc 
1021 tggcagcacg gcaccagcct catgaactac ctgcgcatgg tagaattgga aggtcttacc 
1081 ggccacattg aattcaacag caaaggccag aggtccaact acgctttgaa aatcttacag 
1141 ttcacaagga atggttttcg gcagatcggc cagtggcacg tggcagaggg cctcagcatg 
1201 gacagccacc tctatgcctc caacatctcg gacactctct tcaacaccac cctggtcgtc 
1261 accaccatcc tggaaaaccc atatttaatg ctgaagggga accaccagga gatggaaggc 
1321 aatgaccgct acgagggctt ctgtgtggac atgctcaagg agctggcaga gatcctccga 
1381 ttcaactaca agatccgcct ggttggggat ggcgtgtacg gcgttcccga ggccaacggc 
1441 acctggacgg gaatggtcgg ggagctgatc gctaggaaag cagatctggc tgtggcaggc 
1501 ctcaccatta cagctgaacg ggagaaggtg attgatttct ctaagccatt catgactctg 
1561 ggaattagca ttctttaccg cattcatatg ggacgcaaac ccggctattt ctccttcctg 
1621 gacccatttt ctccgggcgt ctggctcttc atgcttctag cctatctggc cgtcagctgt 
1681 gtcctcttcc tggtggctcg gttgacgccc tacgagtggt acagcccaca cccatgtgcc 
1741 cagggccggt gcaacctcct ggtgaaccag tactccctgg gcaacagcct ctggtttccg 
1801 gtcggggggt tcatgcagca gggctccacc atcgcccctc gcgccttatc cacccgctgt 
1861 gtcagtggcg tctggtgggc attcacgctg atcatcatct catcctacac ggccaacctg 
1921 gcagccttcc tgaccgtgca gcgcatggat gtgcccattg agtcagtgga tgacctggct 
1981 gaccagaccg ccattgaata tggcacaatt cacggaggct ccagcatgac cttcttccaa 
2041 aattcccgct accagaccta ccaacgcatg tggaattaca tgtattccaa gcagcccagc 
2101 gtgttcgtga agagcacaga ggagggaatc gccagggtgt tgaattccaa ctacgccttc 
2161 ctcctggaat ccaccatgaa cgagtactat cggcagcgaa actgcaacct cactcagatt 
2221 gggggcctgc tggacaccaa gggctatggg attggcatgc cagtcggctc ggttttccgg 
2281 gacgagtttg atctggccat tctccagctg caggagaaca accgcctgga gatcctgaag 
2341 cgcaaatggt gggaaggagg gaagtgcccc aaggaggaag atcacagagc taaaggcctg 
2401 ggaatggaga atattggtgg aatctttgtg gttcttattt gtggcttaat cgtggccatt 
2461 tttatggcta tgttggagtt tttatggact ctcagacact cagaagcaac tgaggtgtcc 
2521 gtctgccagg agatggtgac cgagctgcgc agcattatcc tgtgtcagga cagtatccac 
2581 ccccgccggc ggcgcgccgc agtcccgccg ccccggcccc ccatccccga ggagcgccga 
2641 ccgcggggca cggcgacgct cagcaacggg aagctgtgcg gggcagggga gcccgaccag 
2701 ctcgcgcaga gactggcgca ggaggccgcc ctggtggccc gcggctgcac gcacatccgc 
2761 gtctgccccg agtgccgccg cttccagggc ctgcgggcac ggccgtcgcc cgcccgcagc 
2821 gaggagagcc tggagtggga gaaaaccacb aacagcagcg agcccgagta g 

Figure 11 

Published GRIK4 protein sequence (accession NP_055434). 

MPRVSAPLVLLPAWLVMVACSPHSLRIAAILDDPTECSRGERLSITLAKNRINRAPERL 
GKAKVEVDIFELLRDSEYETAETMCQILPKGVVAVLGPSSSPASSSIISNICGEKEVPH 
FKVAPEEFVKFQFQRFTTLNLHPSNTDISVAVAGILNFFNCTTACLICAKAECLLNLEK 
LLRQFLISKDTLSVRMLDDTRDPTPLLKEIRDDKTATIIIHANASMSHTILLKAAELGM 
VSAYYTYIFTNLEFSLQRTDSLVDDRVNILGFSIFNQSHAFFQEFAQSLNQSWQENCDH 
VPFTGPALSSALLFDAVYAVVTAVQELNRSQEIGVKPLSCGSAQIWQHGTSLMNYLRMV 
ELEGLTGHIEFNSKGQRSNYALKILQFTRNGFRQIGQWHVAEGLSMDSHLYASNISDTL 
FNTTLVVTTILENPYLMLKGNHQEMEGNDRYEGFCVDMLKELAEILRFNYKIRLVGDGV 
YGVPEANGTWTGMVGELIARKADLAVAGLTITAEREKVIDFSKPFMTLGISILYRIHMG 
RKPGYFSFLDPFSPGVWLFMLLAYLAVSCVLFLVARLTPYEWYSPHPCAQGRCNLLVNQ 
YSLGNSLWFPVGGFMQQGSTIAPRALSTRCVSGVWWAFTLIIISSYTANLAAFLTVQRM 
DVPIESVDDLADQTAIEYGTIHGGSSMTFFQNSRYQTYQRMWNYMYSKQPSVFVKSTEE 
GIARVLNSNYAFLLESTMNEYYRQRNCNLTQIGGLLDTKGYGIGMPVGSVFRDEFDLAI 
LQLQENNRLEILKRKWWEGGKCPKEEDHRAKGLGMENIGGIFVVLICGLIVAIFMAMLE 
FLWTLRHSEATEVSVCQEMVTELRSIILCQDSIHPRRRRAAVPPPRPPIPEERRPRGTA 
TLSNGKLCGAGEPDQLAQRLAQEAALVARGCTHIRVCPECRRFQGLRARPSPARSEESL 
EWEKTTNSSEPE 

Cytogenetic   Description                          Breakpoint    Breakpoint BAC Clones 
Position                                           YAC Clones    (Acc. No.) 

2p12          Inversion breakpoint                 915_f_7 
2q32.1        Inversion breakpoint                 941_h_12      RP11-358M9 (AC020595) 
2q21.3        Translocation breakpoint             766_c_12      RP11-250H22 (AC011996) 
11q23.3       Upper insertion breakpoint           936_d_9       RP11-89P5 (AC009641) 
11q24.2       Translocation/Insertion breakpoint   749_d_2       RP11-687M24 (AP001007) 

Figure 15 



Exon 1a 

GCGTGGTAGCATGTGCCTGTAATCCCAGTGCTTTGGGACACCGAGGCAGGAGGATCACT 
CGAGCCCAGGAGTGCGAGGCTGCAatgagttatgatcatac 

Exon 1a' 

agatttgtcttctctgccagGTGACGCTAGACTTCAGGAAGACCCCCCATTTCTGCTCC 
ACTCCTGGGCTTGGAGAAGAGTACAGCTGCTCTTGACTGGTGGGACCTTTTGCTGGCTA 
GGGGTGATGGGAGAAGCAAGAGAGGGATCCACACACCTGCGCTTAGCTTTCTATGACCT 
GGGCGGATGGAGGCCAAAGgtaaggtgggatgaga 
M E A K A 



Exon 1b 

CCATGAGGATTCATAGAAGATGCCCCGCGTCTCGGCGCCTTTGGTGCTGCTTCCTGCGT 

MPRVSAPLVLLPAW 
GGCTCGTGATGGTCGCCTGCAGCCCGCACTCCTTGAGGATCGgtaagtgtggcccagct 
LVMVACSPHSLRIA 

Exon 2 

gaaaccccccccaaCTGCTATCTTGGACGACCCCATGGAGTGCAGCAGAGGGGAGCGGC 

AILDDPMECSRGERL 
TCTCCATCACCCTGGCCAAGAACCGCATCAACCGCGCTCCTGAGAGGCTGGGCAAGGCC 

SITLAKNRINRAPERLGKA 
AAGGTCGAAGTGGACATCTTTGAGCTTCTCAGAGACAGCGAGTACGAGACTGCAGAAAC 
KVEVDIFELLRDSEYETAET 



CAgtacgtagactggg
M 



Figure 16 

Alternative nucleic acid sequence. Exons 1a-1a'-2-etc.

1 gcgtggtagc atgtgcctgt aatcccagtg ctttgggaca ccgaggcagg aggatcactc 
61 gagcccagga gtgcgaggct gcagtgacgc tagacttcag gaagaccccc catttctgct
121 ccactcctgg gcttggagaa gagtacagct gctcttgact ggtgggacct tttgctggct 
181 aggggtgatg ggagaagcaa gagagggatc cacacacctg cgcttagctt tctatgacct 
241 gggcggatgg aggccaaagc tgctatcttg gacgacccca tggagtgcag cagaggggag
301 cggctctcca tcaccctggc caagaaccgc atcaaccgcg ctcctgagag gctgggcaag 
361 gccaaggtcg aagtggacat ctttgagctt ctcagagaca gcgagtacga gactgcagaa 
421 accatgtgtc agatcctccc caagggggtg gtcgctgtcc tcggaccatc gtccagccca 
481 gcctccagct ccatcatcag caacatctgt ggagagaagg aggtccctca cttcaaagtg
541 gccccagagg agttcgtcaa gttccagttc cagagattca caaccctgaa cctccacccc 
601 agcaacactg acatcagcgt ggctgtagct gggatcctga acttcttcaa ctgcaccacc 
661 gcctgcctca tctgtgccaa agcagaatgc cttttaaacc tagagaagct gctccggcaa 
721 ttccttatct ccaaggacac gctgtccgtc cgcatgctgg atgacacccg ggaccccacc 
781 ccgctcctca aggagatccg ggacgacaag accgccacca tcatcatcca cgccaacgcc 
841 tccatgtccc acaccatcct cctgaaggca gccgaacttg ggatggtgtc agcctattac 
901 acatacatct tcactaatct ggagttctca ctccagagaa cggacagcct tgtggatgat 
961 cgtgtcaaca tcctgggatt ttccattttc aaccaatccc atgctttctt ccaagagttt 
1021 gcccagagcc tcaaccagtc ctggcaggag aactgtgacc atgtgccctt cactgggcct 
1081 gcgctctcct cggccctgct gtttgatgct gtctatgctg tggtgactgc ggtgcaggaa 
1141 ctgaaccgga gccaagagat cggcgtgaag cccttgtcct gcggctcggc ccagatctgg
1201 cagcacggca ccagcctcat gaactacctg cgcatggtag aattggaagg tcttaccggc
1261 cacattgaat tcaacagcaa aggccagagg tccaactacg ctttgaaaat cttacagttc 
1321 acaaggaatg gttttcggca gatcggccag tggcacgtgg cagagggcct cagcatggac 
1381 agccacctct atgcctccaa catctcggac actctcttca acaccaccct ggtcgtcacc 
1441 accatcctgg aaaacccata tttaatgctg aaggggaacc accaggagat ggaaggcaat 
1501 gaccgctacg agggcttctg tgtggacatg ctcaaggagc tggcagagat cctccgattc 
1561 aactacaaga tccgcctggt tggggatggc gtgtacggcg ttcccgaggc caacggcacc 
1621 tggacgggaa tggtcgggga gctgatcgct aggaaagcag atctggctgt ggcaggcctc 
1681 accattacag ctgaacggga gaaggtgatt gatttctcta agccattcat gactctggga 
1741 attagcattc tttaccgcat tcatatggga cgcaaacccg gctatttctc cttcctggac 
1801 ccattttctc cgggcgtctg gctcttcatg cttctagcct atctggccgt cagctgtgtc 
1861 ctcttcctgg tggctcggtt gacgccctac gagtggtaca gcccacaccc atgtgcccag 
1921 ggccggtgca acctcctggt gaaccagtac tccctgggca acagcctctg gtttccggtc 
1981 ggggggttca tgcagcaggg ctccaccatc gcccctcgcg ccttatccac ccgctgtgtc 
2041 agtggcgtct ggtgggcatt cacgctgatc atcatctcat cctacacggc caacctggca 
2101 gccttcctga ccgtgcagcg catggatgtg cccattgagt cagtggatga cctggctgac 
2161 cagaccgcca ttgaatatgg cacaattcac ggaggctcca gcatgacctt cttccaaaat 
2221 tcccgctacc agacctacca acgcatgtgg aattacatgt attccaagca gcccagcgtg 
2281 ttcgtgaaga gcacagagga gggaatcgcc agggtgttga attccaacta cgccttcctc 
2341 ctggaatcca ccatgaacga gtactatcgg cagcgaaact gcaacctcac tcagattggg 
2401 ggcctgctgg acaccaaggg ctatgggatt ggcatgccag tcggctcggt tttccgggac 
2461 gagtttgatc tggccattct ccagctgcag gagaacaacc gcctggagat cctgaagcgc 
2521 aaatggtggg aaggagggaa gtgccccaag gaggaagatc acagagctaa aggcctggga 
2581 atggagaata ttggtggaat ctttgtggtt cttatttgtg gcttaatcgt ggccattttt 
2641 atggctatgt tggagttttt atggactctc agacactcag aagcaactga ggtgtccgtc 
2701 tgccaggaga tggtgaccga gctgcgcagc attatcctgt gtcaggacag tatccacccc 
2761 cgccggcggc gcgccgcagt cccgccgccc cggcccccca tccccgagga gcgccgaccg 
2821 cggggcacgg cgacgctcag caacgggaag ctgtgcgggg caggggagcc cgaccagctc 
2881 gcgcagagac tggcgcagga ggccgccctg gtggcccgcg gctgcacgca catccgcgtc 
2941 tgccccgagt gccgccgctt ccagggcctg cgggcacggc cgtcgcccgc ccgcagcgag 
3001 gagagcctgg agtgggagaa aaccaccaac agcagcgagc ccgagtag 
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Figure 17 

Complete alternative protein sequence 

MEAKAAILDDPMECSRGERLSITLAKNRINRAPERLGKAKVEVDIFELLRDSEYETAET
MCQILPKGVVAVLGPSSSPASSSIISNICGEKEVPHFKVAPEEFVKFQFQRFTTLNLHP
SNTDISVAVAGILNFFNCTTACLICAKAECLLNLEKLLRQFLISKDTLSVRMLDDTRDP
TPLLKEIRDDKTATIIIHANASMSHTILLKAAELGMVSAYYTYIFTNLEFSLQRTDSLV
DDRVNILGFSIFNQSHAFFQEFAQSLNQSWQENCDHVPFTGPALSSALLFDAVYAVVTA
VQELNRSQEIGVKPLSCGSAQIWQHGTSLMNYLRMVELEGLTGHIEFNSKGQRSNYALK
ILQFTRNGFRQIGQWHVAEGLSMDSHLYASNISDTLFNTTLVVTTILENPYLMLKGNHQ
EMEGNDRYEGFCVDMLKELAEILRFNYKIRLVGDGVYGVPEANGTWTGMVGELIARKAD
LAVAGLTITAEREKVIDFSKPFMTLGISILYRIHMGRKPGYFSFLDPFSPGVWLFMLLA
YLAVSCVLFLVARLTPYEWYSPHPCAQGRCNLLVNQYSLGNSLWFPVGGFMQQGSTIAP
RALSTRCVSGVWWAFTLIIISSYTANLAAFLTVQRMDVPIESVDDLADQTAIEYGTIHG
GSSMTFFQNSRYQTYQRMWNYMYSKQPSVFVKSTEEGIARVLNSNYAFLLESTMNEYYR
QRNCNLTQIGGLLDTKGYGIGMPVGSVFRDEFDLAILQLQENNRLEILKRKWWEGGKCP
KEEDHRAKGLGMENIGGIFVVLICGLIVAIFMAMLEFLWTLRHSEATEVSVCQEMVTEL
RSIILCQDSIHPRRRRAAVPPPRPPIPEERRPRGTATLSNGKLCGAGEPDQLAQRLAQE
AALVARGCTHIRVCPECRRFQGLRARPSPARSEESLEWEKTTNSSEPE
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Figure 18 

NPAS3 (NM_022123) nucleic acid sequence (spliceform 1b-3-4 etc.)

1 ccacgcgtcc gacgcccccc acccgggagg ggggagagag gcaaaaagta agagaggaaa 
61 aaaaatagca ggaagatggc gcccaccaag cccagctttc agcaggatcc ttccaggcga 
121 gaacgtttac aagcattgag aaaggagaaa tcccgagatg ctgctcgctc ccgccgggga 
181 aaagaaaact ttgagttcta tgaattggcc aagttgttgc ctcttcctgc agccattacc 
241 agccagctcg acaaggcatc catcattcga cttacaatta gctatctgaa aatgagggac 
301 tttgctaacc agggggaccc tccgtggaac ttgcgaatgg aaggccctcc acctaacaca 
361 tcagtaaaag gtgcacagcg aaggagaagc cccagtgcac tagccattga agtatttgaa 
421 gcacatttgg gaagccacat tttgcagtcc ctggatggct ttgtatttgc actaaatcag 
481 gaaggaaaat ttttgtacat ttccgaaaca gtctccatct acctaggcct ctcacaagtg 
541 gagctgacag gcagcagtgt ctttgactat gtccaccccg gagatcacgt ggagatggct
601 gagcagctgg gcatgaagct cccccctggg cggggtctcc tgtcacaggg cactgctgag
661 gacggagcca gctcagcatc ttcctcctct cagtcggaga cccccgagcc agtggagtca
721 accagcccca gtctgctaac cactgacaac actcttgagc gttccttttt catccgaatg
781 aaatctactc tgaccaaacg cggtgtgcac atcaaatcat caggatataa ggtgattcac
841 ataacaggcc ggctacgcct gagagtgtcg ctgtcccacg ggaggaccgt ccccagccaa
901 atcatgggtc tcgtggttgt tgcgcatgcc ttgcctcccc ctacgatcaa tgaagtcaga
961 attgactgcc atatgttcgt cactcgagta aatatggacc tcaatatcat ttactgtgaa
1021 aataggatta gtgattatat ggatctgacc cctgtagata tcgtagggaa gagatgctac
1081 cacttcatcc atgctgaaga cgtggagggc atcaggcaca gtcacttgga cttgctgaat
1141 aagggtcagt gtgtgacaaa gtactatcgc tggatgcaga agaacggagg atatatttgg
1201 atacagtcca gtgccaccat agctattaat gccaagaatg caaatgaaaa gaatatcatc
1261 tgggtgaatt accttcttag caatcctgag tacaaggaca cacccatgga catcgcacag
1321 ctcccccatc tgccggagaa aacttccgaa tcctcggaga catccgactc tgagtcagac
1381 tctaaagaca cctcaggtat tacagaggac aacgagaact ccaagtccga cgagaagggg
1441 aaccagtccg agaacagcga agacccggag cccgaccgga agaagtcggg caacgcgtgt
1501 gacaacgaca tgaactgcaa cgacgacggc cacagctcca gtaacccgga cagccgcgac
1561 agcgacgaca gcttcgagca ctcggacttt gagaacccca aggcgggcga ggacggcttc
1621 ggtgctctgg gcgcgatgca gatcaaggtg gagcgctacg tggagagcga gtcggacctg
1681 cggctgcaga actgcgagtc actcacgtcc gacagcgcca aggactcgga cagcgcaggc
1741 gaggcgggcg cgcaggcctc cagcaagcac cagaagcgca agaaaaggcg gaaacggcaa
1801 aagggcggca gcgccagccg ccggcgcctg tccagcgcgt cgagcccagg cggcctggac
1861 gcgggcctgg tggagccccc gcggctgctg tcctccccca acagtgcctc ggtgctcaag
1921 atcaagacgg agatctcaga acccatcaat ttcgacaatg acagcagcat ctggaactac
1981 ccgcccaacc gggagatctc caggaacgag tccccctaca gcatgaccaa gccccccagc
2041 tctgagcact tcccgtcccc gcagggcggc ggcggtgggg gtggcggtgg cggggggctg
2101 cacgtggcca ttcccgactc ggtcctcacc ccgcccggcg ccgacggcgc ggccgcccgc
2161 aagactcagt tcggcgcctc ggccaccgcg gccctggccc ccgtcgcctc cgacccgctg
2221 tcacccccgc tctcggcgtc cccgcgggac aagcaccccg ggaacggcgg cgggggcggg
2281 ggcgggggcg gcggcgcggg gggcggcggc cccagcgcgt ccaactcctt gctgtacact
2341 ggggacctgg aggcgctgca gaggttgcag gcgggcaacg tcgtgctccc gctggtgcac
2401 agggtgaccg ggaccctggc cgccaccagc acggccgcgc agagggtcta caccacgggc
2461 accatccgct acgcgcccgc cgaggtgacc ctggccatgc agagcaacct gctgcccaac
2521 gcgcacgctg ttaacttcgt ggacgttaac agccccggct ttggcctcga ccccaagacg
2581 cccatggaga tgctctacca ccacgtgcac cggctcaaca tgtcaggacc gttcggcggc
2641 gcagtgagcg cagctagcct gacgcagatg cccgccggca acgtgttcac cacggccgag
2701 ggactcttct ccacgctgcc cttccccgtc tacagcaacg gcatccacgc ggcacagact
2761 ctggagcgca aggaggactg aggcgccgcc cgtcctgggc ccggccaggc cccgcttgga
2821 ggaggcatcg tcggcatttt cgtttagacc tttaattcta gcactttgaa ttcgagcagg
2881 tcagcgtctt ctctcgccac gacggtcccc attccacccc ctctttcttt cacctgactt
2941 attctttcgt gtaaagatat gtttattttt tgccttcaga gggtcagacg accagttgcc
3001 tgccgttttg tcttcttcta aggtgtgtgt tgggttgttt tgctttcctt tgcatcttta
3061 ttaagatgtc tttcatgtgt atatgcctct gccatagaat actcagtctt gtggtcaaga
3121 gagttctcaa gtgacaacca ttggggtttc ttcataaaga tcttgatatg atcaagatgg
3181 aaagagacaa gcataaacaa tgtgccctgt ttgactaagt caaatgaaat agggtggttt
3241 ttgtttctgt tcctaattcc tttaaaaaac aggggga.au a. y uattttaga attttiatigca
3301 gaatttaatt ctctttttac ggttaagatt ttaagatttt cttacttgca cataaaaata
3361 atttgggttc ttaaacttaa tttctggcct gtgactagaa tgtttaaaaa aaaaaaaaac
3421 cctcgtgc



Figure 19 

NPAS3 protein sequence (spliceform 1b-3-4 etc.)

KASIIRLTISYLKMRDFANQGDPPWNLRMEGPPPNTSVKGAQRRRSPSALAIEVFEAHL
GSHILQSLDGFVFALNQEGKFLYISETVSIYLGLSQVELTGSSVFDYVHPGDHVEMAEQ
LGMKLPPGRGLLSQGTAEDGASSASSSSQSETPEPVESTSPSLLTTDNTLERSFFIRMK
STLTKRGVHIKSSGYKVIHITGRLRLRVSLSHGRTVPSQIMGLVVVAHALPPPTINEVR
IDCHMFVTRVNMDLNIIYCENRISDYMDLTPVDIVGKRCYHFIHAEDVEGIRHSHLDLL
NKGQCVTKYYRWMQKNGGYIWIQSSATIAINAKNANEKNIIWVNYLLSNPEYKDTPMDI
AQLPHLPEKTSESSETSDSESDSKDTSGITEDNENSKSDEKGNQSENSEDPEPDRKKSG
NACDNDMNCNDDGHSSSNPDSRDSDDSFEHSDFENPKAGEDGFGALGAMQIKVERYVES
ESDLRLQNCESLTSDSAKDSDSAGEAGAQASSKHQKRKKRRKRQKGGSASRRRLSSASS
PGGLDAGLVEPPRLLSSPNSASVLKIKTEISEPINFDNDSSIWNYPPNREISRNESPYS
MTKPPSSEHFPSPQGGGGGGGGGGGLHVAIPDSVLTPPGADGAAARKTQFGASATAALA
PVASDPLSPPLSASPRDKHPGNGGGGGGGGGGAGGGGPSASNSLLYTGDLEALQRLQAG
NVVLPLVHRVTGTLAATSTAAQRVYTTGTIRYAPAEVTLAMQSNLLPNAHAVNFVDVNS
PGFGLDPKTPMEMLYHHVHRLNMSGPFGGAVSAASLTQMPAGNVFTTAEGLFSTLPFPV
YSNGIHAAQTLERKED
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Figure 20 

NPAS3 nucleic acid sequence (spliceform incorporating exons
1a-2-3-4 etc.; similar to mouse cDNA with accession number
NM_013780)

1 ATGGGGAGGG CCGGCGCCGC GGCCAACGGC ACCCCGCAGA ACGTCCAGGG CATCACCTCC 
61 TACCAGCAGC GAATAACTGC CCAGCATCCT CTGCCCAACC AATCAGAATG TAGGAAAATC 
121 TACAGATATG ACGGAATCTA CTGTGAATCT ACCTACCAGA ATTTACAAGG ATTGAGAAAG 
181 GAGAAATCCC GAGATGCTGC TCGCTCCCGC CGGGGAAAAG AAAACTTTGA GTTCTATGAA 
241 TTGGCCAAGT TGTTGCCTCT TCCTGCAGCC ATTACCAGCC AGCTCGACAA GGCATCCATC 
301 ATTCGACTTA CAATTAGCTA TCTGAAAATG AGGGACTTTG CTAACCAGGG GGACCCTCCG 
361 TGGAACTTGC GAATGGAAGG CCCTCCACCT AACACATCAG TAAAAGGTGC ACAGCGAAGG 
421 AGAAGCCCCA GTGCACTAGC CATTGAAGTA TTTGAAGCAC ATTTGGGAAG CCACATTTTG
481 CAGTCCCTGG ATGGCTTTGT ATTTGCACTA AATCAGGAAG GAAAATTTTT GTACATTTCC 
541 GAAACAGTCT CCATCTACCT AGGCCTCTCA CAAGTGGAGC TGACAGGCAG CAGTGTCTTT
601 GACTATGTCC ACCCCGGAGA TCACGTGGAG ATGGCTGAGC AGCTGGGCAT GAAGCTCCCC
661 CCTGGGCGGG GTCTCCTGTC ACAGGGCACT GCTGAGGACG GAGCCAGCTC AGCATCTTCC 
721 TCCTCTCAGT CGGAGACCCC CGAGCCAGTG GAGTCAACCA GCCCCAGTCT GCTAACCACT 
781 GACAACACTC TTGAGCGTTC CTTTTTCATC CGAATGAAAT CTACTCTGAC CAAACGCGGT 
841 GTGCACATCA AATCATCAGG ATATAAGGTG ATTCACATAA CAGGCCGGCT ACGCCTGAGA 
901 GTGTCGCTGT CCCACGGGAG GACCGTCCCC AGCCAAATCA TGGGTCTCGT GGTTGTTGCG 
961 CATGCCTTGC CTCCCCCTAC GATCAATGAA GTCAGAATTG ACTGCCATAT GTTCGTCACT 
1021 CGAGTAAATA TGGACCTCAA TATCATTTAC TGTGAAAATA GGATTAGTGA TTATATGGAT 
1081 CTGACCCCTG TAGATATCGT AGGGAAGAGA TGCTACCACT TCATCCATGC TGAAGACGTG 
1141 GAGGGCATCA GGCACAGTCA CTTGGACTTG CTGAATAAGG GTCAGTGTGT GACAAAGTAC 
1201 TATCGCTGGA TGCAGAAGAA CGGAGGATAT ATTTGGATAC AGTCCAGTGC CACCATAGCT 
1261 ATTAATGCCA AGAATGCAAA TGAAAAGAAT ATCATCTGGG TGAATTACCT TCTTAGCAAT 
1321 CCTGAGTACA AGGACACACC CATGGACATC GCACAGCTCC CCCATCTGCC GGAGAAAACT 
1381 TCCGAATCCT CGGAGACATC CGACTCTGAG TCAGACTCTA AAGACACCTC AGGTATTACA 
1441 GAGGACAACG AGAACTCCAA GTCCGACGAG AAGGGGAACC AGTCCGAGAA CAGCGAAGAC 
1501 CCGGAGCCCG ACCGGAAGAA GTCGGGCAAC GCGTGTGACA ACGACATGAA CTGCAACGAC 
1561 GACGGCCACA GCTCCAGTAA CCCGGACAGC CGCGACAGCG ACGACAGCTT CGAGCACTCG
1621 GACTTTGAGA ACCCCAAGGC GGGCGAGGAC GGCTTCGGTG CTCTGGGCGC GATGCAGATC 
1681 AAGGTGGAGC GCTACGTGGA GAGCGAGTCG GACCTGCGGC TGCAGAACTG CGAGTCACTC 
1741 ACGTCCGACA GCGCCAAGGA CTCGGACAGC GCAGGCGAGG CGGGCGCGCA GGCCTCCAGC 
1801 AAGCACCAGA AGCGCAAGAA AAGGCGGAAA CGGCAAAAGG GCGGCAGCGC CAGCCGCCGG
1861 CGCCTGTCCA GCGCGTCGAG CCCAGGCGGC CTGGACGCGG GCCTGGTGGA GCCCCCGCGG 
1921 CTGCTGTCCT CCCCCAACAG TGCCTCGGTG CTCAAGATCA AGACGGAGAT CTCAGAACCC 
1981 ATCAATTTCG ACAATGACAG CAGCATCTGG AACTACCCGC CCAACCGGGA GATCTCCAGG 
2041 AACGAGTCCC CCTACAGCAT GACCAAGCCC CCCAGCTCTG AGCACTTCCC GTCCCCGCAG 
2101 GGCGGCGGCG GTGGGGGTGG CGGTGGCGGG GGGCTGCACG TGGCCATTCC CGACTCGGTC
2161 CTCACCCCGC CCGGCGCCGA CGGCGCGGCC GCCCGCAAGA CTCAGTTCGG CGCCTCGGCC 
2221 ACCGCGGCCC TGGCCCCCGT CGCCTCCGAC CCGCTGTCAC CCCCGCTCTC GGCGTCCCCG 
2281 CGGGACAAGC ACCCCGGGAA CGGCGGCGGG GGCGGGGGCG GGGGCGGCGG CGCGGGGGGC 
2341 GGCGGCCCCA GCGCGTCCAA CTCCTTGCTG TACACTGGGG ACCTGGAGGC GCTGCAGAGG 
2401 TTGCAGGCGG GCAACGTCGT GCTCCCGCTG GTGCACAGGG TGACCGGGAC CCTGGCCGCC 
2461 ACCAGCACGG CCGCGCAGAG GGTCTACACC ACGGGCACCA TCCGCTACGC GCCCGCCGAG 
2521 GTGACCCTGG CCATGCAGAG CAACCTGCTG CCCAACGCGC ACGCTGTTAA CTTCGTGGAC 
2581 GTTAACAGCC CCGGCTTTGG CCTCGACCCC AAGACGCCCA TGGAGATGCT CTACCACCAC 
2641 GTGCACCGGC TCAACATGTC AGGACCGTTC GGCGGCGCAG TGAGCGCAGC TAGCCTGACG 
2701 CAGATGCCCG CCGGCAACGT GTTCACCACG GCCGAGGGAC TCTTCTCCAC GCTGCCCTTC 
2761 CCCGTCTACA GCAACGGCAT CCACGCGGCA CAGACTCTGG AGCGCAAGGA GGACTGAGGC 
2821 GCCGCCCGTC CTGGGCCCGG CCAGGCCCCG CTTGGAGGAG GCATCGTCGG CATTTTCGTT 
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2881 TAGACCTTTA ATTCTAGCAC TTTGAATTCG AGCAGGTCAG CGTCTTCTCT CGCCACGACG 
2941 GTCCCCATTC CACCCCCTCT T 

Figure 21 

NPAS3 protein sequence of spliceform incorporating exons 1a-2-3-4 etc.

KEKSRDAARSRRGKENFEFYELAKLLPLPAAITSQLDKASIIRLTISYLKMRDFANQGD
PPWNLRMEGPPPNTSVKGAQRRRSPSALAIEVFEAHLGSHILQSLDGFVFALNQEGKFL
YISETVSIYLGLSQVELTGSSVFDYVHPGDHVEMAEQLGMKLPPGRGLLSQGTAEDGAS
SASSSSQSETPEPVESTSPSLLTTDNTLERSFFIRMKSTLTKRGVHIKSSGYKVIHITG
RLRLRVSLSHGRTVPSQIMGLVVVAHALPPPTINEVRIDCHMFVTRVNMDLNIIYCENR
ISDYMDLTPVDIVGKRCYHFIHAEDVEGIRHSHLDLLNKGQCVTKYYRWMQKNGGYIWI
QSSATIAINAKNANEKNIIWVNYLLSNPEYKDTPMDIAQLPHLPEKTSESSETSDSESD
SKDTSGITEDNENSKSDEKGNQSENSEDPEPDRKKSGNACDNDMNCNDDGHSSSNPDSR
DSDDSFEHSDFENPKAGEDGFGALGAMQIKVERYVESESDLRLQNCESLTSDSAKDSDS
AGEAGAQASSKHQKRKKRRKRQKGGSASRRRLSSASSPGGLDAGLVEPPRLLSSPNSAS
VLKIKTEISEPINFDNDSSIWNYPPNREISRNESPYSMTKPPSSEHFPSPQGGGGGGGG
GGGLHVAIPDSVLTPPGADGAAARKTQFGASATAALAPVASDPLSPPLSASPRDKHPGN
GGGGGGGGGGAGGGGPSASNSLLYTGDLEALQRLQAGNVVLPLVHRVTGTLAATSTAAQ
RVYTTGTIRYAPAEVTLAMQSNLLPNAHAVNFVDVNSPGFGLDPKTPMEMLYHHVHRLN
MSGPFGGAVSAASLTQMPAGNVFTTAEGLFSTLPFPVYSNGIHAAQTLERKED




QALR 



Figure 22 




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Figure 25 

PDE4B1 (acc. L20966) Nucleic acid sequence 

1 gcggccgcgg cggtgcagca gaggcgcctc gggcaggagg agggcggctt ctgcgagggc
61 agcctgaggt attaaaaagt gtcagcaaac tgcattgaat aacagacatc ctaagagggg 
121 atattttcca cctctataat gaagaaaagc aggagtgtga tgacggtgat ggctgatgat 
181 aatgttaaag attattttga atgtagcttg agtaaatcct acagttcttc cagtaacaca 
241 cttgggatcg acctctggag agggagaagg tgttgctcag gaaacttaca gttaccacca 
301 ctgtctcaaa gacagagtga aagggcaagg actcctgagg gagatggtat ttccaggccg 
361 accacactgc ctttgacaac gcttccaagc attgctatta caactgtaag ccaggagtgc 
421 tttgatgtgg aaaatggccc ttccccaggt cggagtccac tggatcccca ggccagctct 
481 tccgctgggc tggtacttca cgccaccttt cctgggcaca gccagcgcag agagtcattt
541 ctctacagat cagacagcga ctatgacttg tcaccaaagg cgatgtcgag aaactcttct 
601 cttccaagcg agcaacacgg cgatgacttg attgtaactc cttttgccca ggtccttgcc 
661 agcttgcgaa gtgtgagaaa caacttcact atactgacaa accttcatgg tacatctaac 
721 aagaggtccc cagctgctag tcagcctcct gtctccagag tcaacccaca agaagaatct
781 tatcaaaaat tagcaatgga aacgctggag gaattagact ggtgtttaga ccagctagag 
841 accatacaga cctaccggtc tgtcagtgag atggcttcta acaagttcaa aagaatgctg 
901 aaccgggagc tgacacacct ctcagagatg agccgatcag ggaaccaggt gtctgaatac 
961 atttcaaata ctttcttaga caagcagaat gatgtggaga tcccatctcc tacccagaaa 
1021 gacagggaga aaaagaaaaa gcagcagctc atgacccaga taagtggagt gaagaaatta 
1081 atgcatagtt caagcctaaa caatacaagc atctcacgct ttggagtcaa cactgaaaat 
1141 gaagatcacc tggccaagga gctggaagac ctgaacaaat ggggtcttaa catctttaat 
1201 gtggctggat attctcacaa tagaccccta acatgcatca tgtatgctat attccaggaa 
1261 agagacctcc taaagacatt cagaatctca tctgacacat ttataaccta catgatgact 
1321 ttagaagacc attaccattc tgacgtggca tatcacaaca gcctgcacgc tgctgatgta 
1381 gcccagtcga cccatgttct cctttctaca ccagcattag acgctgtctt cacagatttg 
1441 gagatcctgg ctgccatttt tgcagctgcc atccatgacg ttgatcatcc tggagtctcc 
1501 aatcagtttc tcatcaacac aaattcagaa cttgctttga tgtataatga tgaatctgtg 
1561 ttggaaaatc atcaccttgc tgtgggtttc aaactgctgc aagaagaaca ctgtgacatc 
1621 ttcatgaatc tcaccaagaa gcagcgtcag acactcagga agatggttat tgacatggtg 
1681 ttagcaactg atatgtctaa acatatgagc ctgctggcag acctgaagac aatggtagaa 
1741 acgaagaaag ttacaagttc aggcgttctt ctcctagaca actataccga tcgcattcag 
1801 gtccttcgca acatggtaca ctgtgcagac ctgagcaacc ccaccaagtc cttggaattg 
1861 tatcggcaat ggacagaccg catcatggag gaatttttcc agcagggaga caaagagcgg 
1921 gagaggggaa tggaaattag cccaatgtgt gataaacaca cagcttctgt ggaaaaatcc 
1981 caggttggtt tcatcgacta cattgtccat ccattgtggg agacatgggc agatttggta 
2041 cagcctgatg ctcaggacat tctcgatacc ttagaagata acaggaactg gtatcagagc 
2101 atgatacctc aaagtccctc accaccactg gacgagcaga acagggactg ccagggtctg 
2161 atggagaagt ttcagtttga actgactctc gatgaggaag attctgaagg acctgagaag 
2221 gagggagagg gacacagcta tttcagcagc acaaagacgc tttgtgtgat tgatccagaa 
2281 aacagagatt ccctgggaga gactgacata gacattgcaa cagaagacaa gtcccccgtg 
2341 gatacataat ccccctctcc ctgtggagat gaacattcta tccttgatga gcatgccagc 
2401 tatgtggtag ggccagccca ccatgggggc caagacctgc acaggacaag ggccacctgg
2461 cctttcagtt acttgagttt ggagtcagaa agcaagacca ggaagcaaat agcagctcag 
2521 gaaatcccac ggttgacttg ccttgatggc aagcttggtg gagagggctg aagctgttgc 
2581 tgggggccga ttctgatcaa gacacatggc ttgaaaatgg aagacacaaa actgagagat 
2641 cattctgcac taagtttcgg gaacttatcc ccgacagtga ctgaactcac tgactaataa 
2701 cttcatttat gaatcttctc acttgtccct ttgtctgcca acctgtgtgc cttttttgta 
2761 aaacattttc atgtctttaa aatgcctgtt gaatacctgg agtttagtat caacttctac 
2821 acagataagc tttcaaagtt gacaaacttt tttgactctt tctggaaaag ggaaagaaaa 
2881 tagtcttcct tctttcttgg gcaatatcct tcactttact acagttactt ttgcaaacag 
2941 acagaaagga tacacttcta accacatttt acttccttcc cctgttgtcc agtccaactc 
3001 cacagtcact cttaaaactt ctctctgttt gcctgcctcc aacagtactt ttaacttttt 






3061 gctgtaaaca gaataaaatt gaacaaatta gggggtagaa aggagcagtg gtgtcgttca 
3121 ccgtgagagt ctgcatagaa ctcagcagtg tgccctgctg tgtcttggac cctgcaatgc
3181 ggccgc 



Figure 26

PDE4B1 Protein sequence

QTYRSVSEMASNKFKRMLNRELTHLSEMSRSGNQVSEYISNTFLDKQNDVEIPSPTQKD
REKKKKQQLMTQISGVKKLMHSSSLNNTSISRFGVNTENEDHLAKELEDLNKWGLNIFN
VAGYSHNRPLTCIMYAIFQERDLLKTFRISSDTFITYMMTLEDHYHSDVAYHNSLHAAD
VAQSTHVLLSTPALDAVFTDLEILAAIFAAAIHDVDHPGVSNQFLINTNSELALMYNDE
SVLENHHLAVGFKLLQEEHCDIFMNLTKKQRQTLRKMVIDMVLATDMSKHMSLLADLKT
MVETKKVTSSGVLLLDNYTDRIQVLRNMVHCADLSNPTKSLELYRQWTDRIMEEFFQQG
DKERERGMEISPMCDKHTASVEKSQVGFIDYIVHPLWETWADLVQPDAQDILDTLEDNR
NWYQSMIPQSPSPPLDEQNRDCQGLMEKFQFELTLDEEDSEGPEKEGEGHSYFSSTKTL
CVIDPENRDSLGETDIDIATEDKSPVDT
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Figure 27 

PDE4B3 (acc. U85048) Nucleic acid sequence 

1 atgacagcaa aagattcttc aaaggaactt actgcttctg aacctgaggt ttgcataaag 
61 actttcaagg agcaaatgca tttagaactt gagcttccga gattaccagg aaacagacct 
121 acatctccta aaatttctcc acgcagttca ccaaggaact caccatgctt tttcagaaag 
181 ttactggtga ataaaagcat tcggcagcgt cgtcgcttca ctgtggctca tacatgcttt 
241 gatgtggaaa atggcccttc cccaggtcgg agtccactgg atccccaggc cagctcttcc 
301 gctgggctgg tacttcacgc cacctttcct gggcacagcc agcgcagaga gtcatttctc
361 tacagatcag acagcgacta tgacttgtca ccaaaggcga tgtcgagaaa ctcttctctt 
421 ccaagcgagc aacacggcga tgacttgatt gtaactcctt ttgcccaggt ccttgccagc 
481 ttgcgaagtg tgagaaacaa cttcactata ctgacaaacc ttcatggtac atctaacaag 
541 aggtccccag ctgctagtca gcctcctgtc tccagagtca acccacaaga agaatcttat 
601 caaaaattag caatggaaac gctggaggaa ttagactggt gtttagacca gctagagacc 
661 atacagacct accggtctgt cagtgagatg gcttctaaca agttcaaaag aatgctgaac 
721 cgggagctga cacacctctc agagatgagc cgatcaggga accaggtgtc tgaatacatt 
781 tcaaatactt tcttagacaa gcagaatgat gtggagatcc catctcctac ccagaaagac 
841 agggagaaaa agaaaaagca gcagctcatg acccagataa gtggagtgaa gaaattaatg 
901 catagttcaa gcctaaacaa tacaagcatc tcacgctttg gagtcaacac tgaaaatgaa 
961 gatcacctgg ccaaggagct ggaagacctg aacaaatggg gtcttaacat ctttaatgtg 
1021 gctggatatt ctcacaatag acccctaaca tgcatcatgt atgctatatt ccaggaaaga 
1081 gacctcctaa agacattcag aatctcatct gacacattta taacctacat gatgacttta 
1141 gaagaccatt accattctga cgtggcatat cacaacagcc tgcacgctgc tgatgtagcc 
1201 cagtcgaccc atgttctcct ttctacacca gcattagacg ctgtcttcac agatttggag 
1261 atcctggctg ccatttttgc agctgccatc catgacgttg atcatcctgg agtctccaat 
1321 cagtttctca tcaacacaaa ttcagaactt gctttgatgt ataatgatga atctgtgttg 
1381 gaaaatcatc accttgctgt gggtttcaaa ctgctgcaag aagaacactg tgacatcttc 
1441 atgaatctca ccaagaagca gcgtcagaca ctcaggaaga tggttattga catggtgtta 
1501 gcaactgata tgtctaaaca tatgagcctg ctggcagacc tgaagacaat ggtagaaacg 
1561 aagaaagtta caagttcagg cgttcttctc ctagacaact ataccgatcg cattcaggtc 
1621 cttcgcaaca tggtacactg tgcagacctg agcaacccca ccaagtcctt ggaattgtat 
1681 cggcaatgga cagaccgcat catggaggaa tttttccagc agggagacaa agagcgggag 
1741 aggggaatgg aaattagccc aatgtgtgat aaacacacag cttctgtgga aaaatcccag 
1801 gttggtttca tcgactacat tgtccatcca ttgtgggaga catgggcaga tttggtacag 
1861 cctgatgctc aggacattct cgatacctta gaagataaca ggaactggta tcagagcatg 
1921 atacctcaaa gtccctcacc accactggac gagcagaaca gggactgcca gggtctgatg 
1981 gagaagtttc agtttgaact gactctcgat gaggaagatt ctgaaggacc tgagaaggag 
2041 ggagagggac acagctattt cagcagcaca aagacgcttt gtgtgattga tccagaaaac 
2101 agagattccc tgggagagac tgacatagac attgcaacag aagacaagtc ccccgtggat 
2161 aca 
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Figure 28 



PDE4B3 Protein sequence 




RMLNRELTHLSEMSRSGNQVSEYISNTFLDKQNDVEIPSPTQKDREKKKKQQLMTQISG
VKKLMHSSSLNNTSISRFGVNTENEDHLAKELEDLNKWGLNIFNVAGYSHNRPLTCIMY
AIFQERDLLKTFRISSDTFITYMMTLEDHYHSDVAYHNSLHAADVAQSTHVLLSTPALD
AVFTDLEILAAIFAAAIHDVDHPGVSNQFLINTNSELALMYNDESVLENHHLAVGFKLL
QEEHCDIFMNLTKKQRQTLRKMVIDMVLATDMSKHMSLLADLKTMVETKKVTSSGVLLL
DNYTDRIQVLRNMVHCADLSNPTKSLELYRQWTDRIMEEFFQQGDKERERGMEISPMCD
KHTASVEKSQVGFIDYIVHPLWETWADLVQPDAQDILDTLEDNRNWYQSMIPQSPSPPL
DEQNRDCQGLMEKFQFELTLDEEDSEGPEKEGEGHSYFSSTKTLCVIDPENRDSLGETD
IDIATEDKSPVDT
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Figure 29 

PDE4B2 (acc. NM_002600) Nucleic acid sequence

1 gaattcctcc tctcttcacc ccgttagctg ttttcaatgt aatgctgccg tccttctctt 
61 gcactgcctt ctgcgctaac acctccattc ctgtttataa ccgtgtattt attacttaat 
121 gtatataatg taatgttttg taagttatta atttatatat ctaacattgc ctgccaatgg 
181 tggtgttaaa tttgtgtaga aaactctgcc taagagttac gactttttct tgtaatgttt 
241 tgtattgtgt attatataac ccaaacgtca cttagtagag acatatggcc cccttggcag 
301 agaggacagg ggtgggcttt tgttcaaagg gtctgccctt tccctgcctg agttgctact
361 tctgcacaac ccctttatga accagttttc acccgaattt tgactgtttc atttagaaga 
421 aaagcaaaat gagaaaaagc tttcctcatt tctccttgag atggcaaagc actcagaaat 
481 gacatcacat accctaaaga accctgggat gactaaggca gagagagtct gagaaaactc 
541 tttggtgctt ctgcctttag ttttaggaca catttatgca gatgagctta taagagaccg 
601 ttccctccgc cttcttcctc agaggaagtt tcttggtaga tcaccgacac ctcatccagg 
661 cggggggttg gggggaaact tggcaccagc catcccaggc agagcaccac tgtgatttgt
721 tctcctggtg gagagagctg gaaggaagga gccagcgtgc aaataatgaa ggagcacggg 
781 ggcaccttca gtagcaccgg aatcagcggt ggtagcggtg actctgctat ggacagcctg 
841 cagccgctcc agcctaacta catgcctgtg tgtttgtttg cagaagaatc ttatcaaaaa 
901 ttagcaatgg aaacgctgga ggaattagac tggtgtttag accagctaga gaccatacag 
961 acctaccggt ctgtcagtga gatggcttct aacaagttca aaagaatgct gaaccgggag 
1021 ctgacacacc tctcagagat gagccgatca gggaaccagg tgtctgaata catttcaaat 
1081 actttcttag acaagcagaa tgatgtggag atcccatctc ctacccagaa agacagggag 
1141 aaaaagaaaa agcagcagct catgacccag ataagtggag tgaagaaatt aatgcatagt 
1201 tcaagcctaa acaatacaag catctcacgc tttggagtca acactgaaaa tgaagatcac 
1261 ctggccaagg agctggaaga cctgaacaaa tggggtctta acatctttaa tgtggctgga 
1321 tattctcaca atagacccct aacatgcatc atgtatgcta tattccagga aagagacctc 
1381 ctaaagacat tcagaatctc atctgacaca tttataacct acatgatgac tttagaagac 
1441 cattaccatt ctgacgtggc atatcacaac agcctgcacg ctgctgatgt agcccagtcg 
1501 acccatgttc tcctttctac accagcatta gacgctgtct tcacagattt ggagatcctg 
1561 gctgccattt ttgcagctgc catccatgac gttgatcatc ctggagtctc caatcagttt 
1621 ctcatcaaca caaattcaga acttgctttg atgtataatg atgaatctgt gttggaaaat 
1681 catcaccttg ctgtgggttt caaactgctg caagaagaac actgtgacat cttcatgaat 
1741 ctcaccaaga agcagcgtca gacactcagg aagatggtta ttgacatggt gttagcaact 
1801 gatatgtcta aacatatgag cctgctggca gacctgaaga caatggtaga aacgaagaaa 
1861 gttacaagtt caggcgttct tctcctagac aactataccg atcgcattca ggtccttcgc 
1921 aacatggtac actgtgcaga cctgagcaac cccaccaagt ccttggaatt gtatcggcaa 
1981 tggacagacc gcatcatgga ggaatttttc cagcagggag acaaagagcg ggagagggga 
2041 atggaaatta gcccaatgtg tgataaacac acagcttctg tggaaaaatc ccaggttggt 
2101 ttcatcgact acattgtcca tccattgtgg gagacatggg cagatttggt acagcctgat 
2161 gctcaggaca ttctcgatac cttagaagat aacaggaact ggtatcagag catgatacct 
2221 caaagtccct caccaccact ggacgagcag aacagggact gccagggtct gatggagaag 
2281 tttcagtttg aactgactct cgatgaggaa gattctgaag gacctgagaa ggagggagag 
2341 ggacacagct atttcagcag cacaaagacg ctttgtgtga ttgatccaga aaacagagat 
2401 tccctgggag agactgacat agacattgca acagaagaca agtcccccgt ggatacataa 
2461 tccccctctc cctgtggaga tgaacattct atccttgatg agcatgccag ctatgtggta 
2521 gggccagccc accatggggg ccaagacctg cacaggacaa gggccacctg gcctttcagt 
2581 tacttgagtt tggagtcaga aagcaagacc aggaagcaaa tagcagctca ggaaatccca 
2641 cggttgactt gccttgatgg caagcttggt ggagagggct gaagctgttg ctgggggccg 
2701 attctgatca agacacatgg cttgaaaatg gaagacacaa aactgagaga tcattctgca 
2761 ctaagtttcg ggaacttatc cccgacagtg actgaactca ctgactaata acttcattta 
2821 tgaatcttct cacttgtccc tttgtctgcc aacctgtgtg ccttttttgt aaaacatttt 
2881 catgtcttta aaatgcctgt tgaatacctg gagtttagta tcaacttcta cacagataag 
2941 ctttcaaagt tgacaaactt ttttgactct ttctggaaaa gggaaagaaa atagtcttcc 
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3001 ttctttcttg ggcaatatcc ttcactttac tacagttact tttgcaaaca gacagaaagg 
3061 atacacttct aaccacattt tacttccttc ccctgttgtc cagtccaact ccacagtcac 
3121 tcttaaaact tctctctgtt tgcctgcctc caacagtact tttaactttt tgctgtaaac 
3181 agaataaaat tgaacaaatt agggggtaga aaggagcagt ggtgtcgttc accgtgagag 
3241 tctgcataga actcagcagt gtgccctgct gtgtcttgga ccctgccccc cacaggagtt 
3301 gctacagtcc ctggccctgc ttcccatcct cctctcttca ccccgttagc tgttttcaat 
3361 gtaatgctgc cgtccttctc ttgcactgcc ttctgcgcta acacctccat tcctgtttat 
3421 aaccgtgtat ttattactta atgtatataa tgtaatgttt tgtaagttat taatttatat 
3481 atctaacatt gcctgccaat ggtggtgtta aatttgtgta gaaaactctg cctaagagtt 
3541 acgacttttt cttgtaatgt tttgtattgt gtattatata acccaaacgt cacttagtag 
3601 agacatatgg cccccttggc agagaggaca ggggtgggct tttgttcaaa gggtctgccc 
3661 tttccctgcc tgagttgcta cttctgcaca acccctttat gaaccagttt tggaaacaat 
3721 attctcacat tagatactaa atggtttata ctgagtcttt tacttttgta tagcttgata 
3781 ggggcagggg caatgggatg tagtttttac ccaggttcta tccaaatcta tgtgggcatg 
3841 agttgggtta taactggatc ctactatcat tgtggctttg gttcaaaagg aaacactaca 
3901 tttgctcaca gatgattctt ctgattcttc tgaatgctcc cgaactactg actttgaaga 
3961 ggtagcctcc tgcctgccat taagcaggaa tgtcatgttc cagttcatta caaaagaaaa 
4021 caataaaaca atgtgaattt ttataataaa aaaaaaaaaa aggaattc 



Figure 30 

PDE4B2 Protein sequence 



MKEHGGTFSSTGISGGSGDSAMDSLQPLQPNYMPVCLFAEESYQKLAMETLEELDWCLD
QLETIQTYRSVSEMASNKFKRMLNRELTHLSEMSRSGNQVSEYISNTFLDKQNDVEIPS
PTQKDREKKKKQQLMTQISGVKKLMHSSSLNNTSISRFGVNTENEDHLAKELEDLNKWG
LNIFNVAGYSHNRPLTCIMYAIFQERDLLKTFRISSDTFITYMMTLEDHYHSDVAYHNS
LHAADVAQSTHVLLSTPALDAVFTDLEILAAIFAAAIHDVDHPGVSNQFLINTNSELAL
MYNDESVLENHHLAVGFKLLQEEHCDIFMNLTKKQRQTLRKMVIDMVLATDMSKHMSLL
ADLKTMVETKKVTSSGVLLLDNYTDRIQVLRNMVHCADLSNPTKSLELYRQWTDRIMEE
FFQQGDKERERGMEISPMCDKHTASVEKSQVGFIDYIVHPLWETWADLVQPDAQDILDT
LEDNRNWYQSMIPQSPSPPLDEQNRDCQGLMEKFQFELTLDEEDSEGPEKEGEGHSYFS
STKTLCVIDPENRDSLGETDIDIATEDKSPVDT
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Figure 31 
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Figure 32 
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Figure 33 
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Figure 34 
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Figure 35 

1 agccatttgt gaacctggag gcttgacatt cgccagcgca gggccccaca agagaaattt 
61 caatgaaaag aaaagccaat ggattgtggt cttagaaaag ctgcttagat gatgtctgtt 
121 tcccgtgcta tagacacgtg gcagagctgt aagtaaatgc tcggcactgc atgatgaatt 
181 ggatggctgc agaccggaga caaaaaaaat aattgtctca ttttcgtggt gatttgctta 
241 actggtggga ccatgccaga acggctagcg gaaatgctct tggatctctg gactccatta
301 ataatattat ggattactct tcccccttgc atttacatgg ctccgatgaa tcagtctcaa 
361 gttttaatga gtggatcccc tttggaacta aacagtctgg gtgaagaaca gcgaattttg
421 aaccgctcca aaagaggctg ggtttggaat caaatgtttg tcctggaaga gttttctgga 
481 cctgaaccga ttcttgttgg ccggctacac acagacctgg atcctgggag caaaaaaatc 
541 aagtatatcc tatcaggtga tggagctggg accatatttc aaataaatga tgtaactgga 
601 gatatccatg ctataaaaag acttgaccgg gaggaaaagg ctgagtatac cctaacagct 
661 caagcagtgg actgggagac aagcaaacct ctggagcctc cttctgaatt tattattaaa 
721 gttcaagaca tcaatgacaa tgcaccagag tttcttaatg gaccctatca tgctactgtg 
781 ccagaaatgt ccattttggg tacatctgtc actaacgtca ctgcgaccga cgctgatgac 
841 ccagtttatg gaaacagtgc aaagttggtt tatagtatat tggaagggca gccttatttt 
901 tccattgagc ctgaaacagc tattataaaa actgcccttc ccaacatgga cagagaagcc 
961 aaggaggagt acctggttgt tatccaagcc aaagatatgg gtggacactc tggtggcctg 
1021 tctgggacca cgacacttac agtgactctt actgatgtta atgacaatcc tccaaaattt 
1081 gcacagagcc tgtatcactt ctcagtaccg gaagatgtgg ttcttggcac tgcaatagga
1141 agggtgaagg ccaatgatca ggatattggt gaaaatgcac agtcatcata tgatatcatc 
1201 gatggagatg gaacagcact ttttgaaatc acttctgatg cccaggccca ggatggcatt 
1261 ataaggctaa gaaaacctct ggactttgag accaaaaaat cctatacgct aaaggtagag 
1321 gcagccaatg tccatattga cccacgcttc agtggcaggg ggccctttaa agacacggcg 
1381 acagtcaaaa tcgtggttga agatgctgat gagcctccgg tcttctcttc accgacttac 
1441 ctacttgaag ttcatgaaaa tgctgctcta aactccgtga ttgggcaagt gactgctcgt 
1501 gaccctgata tcacttccag tcctataagg ttttccatcg accggcacac tgacctggag 
1561 aggcagttca acattaatgc agacgatggg aagataacgc tggcaacacc acttgacaga 
1621 gaattaagtg tatggcacaa cataacaatc attgctactg aaattaggaa ccacagtcag 
1681 atatcacgag tacctgttgc tattaaagtg ctggatgtca atgacaacgc ccctgaattc
1741 gcatccgaat atgaggcatt tttatgtgaa aatggaaaac ccggccaagt cattcaaact 
1801 gttagcgcca tggacaaaga tgatcccaaa aacggacatt atttcttata cagtctcctt 
1861 ccagaaatgg tcaacaatcc gaatttcacc atcaagaaaa atgaagataa ttccctcagt 
1921 attttggcaa agcataatgg attcaaccgc cagaagcaag aagtctatct tttaccaatc 
1981 ataatcagtg atagtggaaa tcctccactg agcagcacta gcaccttgac aatcagggtc 
2041 tgtggctgca gcaatgacgg tgtcgtccag tcttgcaatg tcgaagctta tgtccttcca 
2101 attggactca gtatgggcgc cttaattgcc atattagcat gcatcatttt gctgttagtc 
2161 atcgtggtgc tgtttgtaac tctacggcgg cataaaaatg aaccattaat tatcaaagat 
2221 gatgaagacg ttcgagaaaa catcattcgc tacgatgatg aaggaggagg ggaggaggac 
2281 acagaggctt ttgacattgc aactttacaa aatccagatg gaattaatgg atttttaccc 
2341 cgtaaggata ttaaaccaga tttgcagttt atgccaaggc aagggcttgc tccagttcca 
2401 aatggtgttg atgtcgatga atttataaat gtaaggctgc atgaggcaga taatgatccc
2461 acggccccgc catatgactc cattcagata tatggctatg aaggccgagg gtcagtggct 
2521 ggctccctca gctccttgga gtccaccaca tcagactcag accagaattt tgactacctc 
2581 agtgactggg gtccccgctt taagagactg ggcgaactct actctgttgg tgaaagtgac 
2641 aaagaaactt gacagtggat tataaataaa tcactggaac tgagcattct gtaatattct
2701 agggtcactc cccttagata caaccaatgt ggctatttgt tttagaggca agtttagcac 
2761 cagtcatcta taaactcaac cacattttaa tgttgaacca aaaaaagata ataaaataaa 
2821 aaagtatatg ttaggaggtt ataaatcttg tggagtgtga attaagtatg tggagtgtct 
2881 agaagtcctt ggatatttga tatttacctg accaccacag acaaagatt 
Figure 36

1 MPERLAEMLL DLWTPLIILW ITLPPCIY MPi PMNQSQVLMS GSPLELNSLG EEQRILNRSK 
61 RGWVWNQMFV LEEFSGPEPI LVGRLHTDLD PGSKKIKYIL SGDGAGTIFQ INDVTGDIHA
121 IKRLDREEKA EYTLTAQAVD WETSKPLEPP SEFIIKVQDI NDNAPEFLNG PYHATVPEMS
181 ILGTSVTNVT ATDADDPVYG NSAKLVYSIL EGQPYFSIEP ETAIIKTALP NMDREAKEEY 
241 LVVIQAKDMG GHSGGLSGTT TLTVTLTDVN DNPPKFAQSL YHFSVPEDVV LGTAIGRVKA
301 NDQDIGENAQ SSYDIIDGDG TALFEITSDA QAQDGIIRLR KPLDFETKKS YTLKVEAANV 
361 HIDPRFSGRG PFKDTATVKI VVEDADEPPV FSSPTYLLEV HENAALNSVI GQVTARDPDI
421 TSSPIRFSID RHTDLERQFN INADDGKITL ATPLDRELSV WHNITIIATE IRNHSQISRV 
481 PVAIKVLDVN DNAPEFASEY EAFLCENGKP GQVIQTVSAM DKDDPKNGHY FLYSLLPEMV
541 NNPNFTIKKN EDNSLSILAK HNGFNRQKQE VYLLPIIISD SGNPPLSSTS TLTIRVCGCS
601 NDGVVQSCNV EAYVLPIGLS MGALIAILAC IILLLVIVVL FVTLRRHKNE PLIIKDDEDV
661 RENIIRYDDE GGGEEDTEAF DIATLQNPDG INGFLPRKDI KPDLQFMPRQ GLAPVPNGVD
721 VDEFINVRLH EADNDPTAPP YDSIQIYGYE GRGSVAGSLS SLESTTSDSD QNFDYLSDWG 
781 PRFKRLGELY SVGESDKET 
Figure 37

a) 

MPERLAEMLLDLWTPLIILWITLPPCIY^ 
KRGWVWNQMFVLEEFSGPEPILVGRVLKSVSKLH* 



b) 

GRGGAAEAPRAGGGRLLRGQ
3 ggccgcggcggtgcagcagaggcgcctcgggcaggaggagggcggcttctgcgagggcag 62 
PELHTDLDPGSKKIKYILSG
63 cctgagctacacacagacctggatcctgggagcaaaaaaatcaagtatatcctatcaggt 122 

DGAGTIFQINDVTGDIHAIK
123 gatggagctgggaccatatttcaaataaatgatgtaactggagatatccatgctataaaa 182 

RLDREEKAEYTLTAQAVDWE
183 agacttgaccgggaggaaaaggctgagtataccctaacagctcaagcagtggactgggag 242 

TSKPLEPPSEFIIKVQDIND
243 acaagcaaacctctggagcctccttctgaatttattattaaagttcaagacatcaatgac 302 

NAPEFLNGPYHATVPEMSIL
303 aatgcaccagagtttcttaatggaccctatcatgctactgtgccagaaatgtccattttg 362 

GTSVTNVTATDADDPVYGNS
363 ggtacatctgtcactaacgtcactgcgaccgacgctgatgacccagtttatggaaacagt 422 

AKLVYSILEGQPYFSIEPET
423 gcaaagttggtttatagtatattggaagggcagccttatttttccattgagcctgaaaca 482 

AIIKTALPNMDREAKEEYLV
483 gctattataaaaactgcccttcccaacatggacagagaagccaaggaggagtacctggtt 542

VIQAKDMGGHSGGLSGTTTL
543 gttatccaagccaaagatatgggtggacactctggtggcctgtctgggaccacgacactt 602 

TVTLTDVNDNPPKFAQSLYH 
603 acagtgactcttactgatgttaatgacaatcctccaaaatttgcacagagcctgtatcac 662 

FSVPEDVVLGTAIGRVKAND
663 ttctcagtaccggaagatgtggttcttggcactgcaataggaagggtgaaggccaatgat 722 

QDIGENAQSSYDIIDGDGTA
723 caggatattggtgaaaatgcacagtcatcatatgatatcatcgatggagatggaacagca 782 

LFEITSDAQAQDGIIRLRKP
783 ctttttgaaatcacttctgatgcccaggcccaggatggcattataaggctaagaaaacct 842 

LDFETKKSYTLKVEAANVHI 
843 ctggactttgagaccaaaaaatcctatacgctaaaggtagaggcagccaatgtccatatt 902 

DPRFSGRGPFKDTATVKIVV
903 gacccacgcttcagtggcagggggccctttaaagacacggcgacagtcaaaatcgtggtt 962 

EDADEPPVFSSPTYLLEVHE 
963 gaagatgctgatgagcctccggtcttctcttcaccgacttacctacttgaagttcatgaa 1022 

NAALNSVIGQVTAR
1023 aatgctgctctaaactccgtgattgggcaagtgactgctcgt etc.